Ingestion Process


| Requirement | Evaluation | Remarks |
| --- | --- | --- |
| Change data capture | Medium | Integration with third-party CDC tools such as Debezium can be provided (see the consumer sketch after the table). |
| Scheduled ingestion | Very high | Yes, it is available off the shelf. |
| Minimise fields | Very high | Field minimization can be defined through the Data Preparation tool and published for use in live Data Pipelines. |
| Filter by lookup | Very high | Yes, it is a standard component. |
| Filter by consent | Very high | It can be achieved via API integration with the consent system, or through a consent database lookup. |
| Anonymise fields | Very high | Standard anonymization is available via the Data Preparation option or the Spark SQL component (pattern sketched after the table). |
| Compose ingestion processors | Very high | Processors are composed on a drag-and-drop, low-code canvas. |
| Ingestion fault tolerance | Very high | Faults can be tracked and a recovery sub-process initiated. |
| Bootstrap + updates | Very high | A pipeline can be defined to load historic data and then apply subsequent updates, as per the data load strategy. |
| Reports + metrics | High | Data Pipeline generates a metric report for every process: memory used, CPU used, number of records processed, etc. |
| Performance impact threshold | High | Compute resource allocation is configurable, and instances can be scaled up. |
| Secret management integration | Very high | All secrets are stored in Kubernetes Secrets, and the platform integrates with them directly (consumption pattern sketched after the table). |
| Data catalogue integration | Very high | The platform automatically generates a data catalogue from the underlying metadata. |
| Visual interface | Very high | Pipeline Studio offers a drag-and-drop visual interface built on a no-code/low-code approach. |
| Ingestion manifest file | Very high | Achievable via internal metadata. |
| CI/CD pipeline integration | High | Yes, it provides the facility to check Pipeline definitions and metadata in and out of GitLab. |
| Ingestion access management | Very high | Data Pipeline supports role-based access control (RBAC). |
| Ingestion audit logs | Very high | Logs can be pushed to third-party monitoring systems such as Datadog and Prometheus (a metrics sketch follows the table). |
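For context on the change data capture row, the sketch below shows what consuming Debezium change events could look like on the ingestion side. It assumes a Debezium connector publishing to Kafka with the default JSON envelope; the broker address, consumer group, and topic name are hypothetical placeholders, not BDB's actual integration code.

```python
# Hypothetical sketch: consuming Debezium CDC events from Kafka.
# Broker, group id, and topic are illustrative placeholders.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "kafka:9092",   # hypothetical broker
    "group.id": "ingestion-cdc-demo",    # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
# Debezium names topics as <server>.<schema>.<table>.
consumer.subscribe(["dbserver1.inventory.customers"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        payload = event.get("payload", {})
        # 'op' is c/u/d/r (create/update/delete/snapshot read);
        # 'after' holds the row state following the change.
        print(payload.get("op"), payload.get("after"))
finally:
    consumer.close()
```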
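To illustrate the anonymise-fields row, here is a minimal PySpark sketch of anonymization and field minimization expressed in Spark SQL. The dataset and column names are hypothetical; this shows only the general Spark SQL pattern, not the platform's own component.

```python
# Illustrative Spark SQL anonymization; dataset and columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("anonymize-sketch").getOrCreate()

df = spark.createDataFrame(
    [("alice@example.com", "Alice", 34), ("bob@example.com", "Bob", 41)],
    ["email", "name", "age"],
)
df.createOrReplaceTempView("customers")

# Hash the direct identifier, drop the free-text name (field minimization),
# and keep only the attribute needed downstream.
anonymized = spark.sql("""
    SELECT sha2(email, 256) AS email_hash,  -- irreversible pseudonym
           age
    FROM customers
""")
anonymized.show(truncate=False)
```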
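For the secret management row, the standard Kubernetes pattern is to project Secrets into pipeline pods as files (or inject them as environment variables). The sketch below shows that consumption pattern; the mount path and key name are hypothetical and not BDB-specific.

```python
# Minimal sketch of consuming a Kubernetes Secret from inside a pod.
# The mount directory and secret key are hypothetical placeholders.
import os
from pathlib import Path

def read_secret(key: str, mount_dir: str = "/var/run/secrets/pipeline") -> str:
    """Prefer the file projected by a Secret volume; fall back to an env var."""
    secret_file = Path(mount_dir) / key
    if secret_file.is_file():
        return secret_file.read_text().strip()
    env_name = key.upper().replace("-", "_")
    value = os.environ.get(env_name)
    if value is None:
        raise KeyError(f"secret {key!r} not found in {mount_dir} or ${env_name}")
    return value

db_password = read_secret("db-password")  # hypothetical secret key
```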
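Finally, for the reports/metrics and audit log rows, this is a hedged sketch of exposing per-component ingestion counters in the Prometheus exposition format using the prometheus_client library. The metric names, labels, and scrape port are illustrative, not the platform's actual metric schema.

```python
# Illustrative Prometheus instrumentation for an ingestion component.
# Metric names, labels, and the scrape port are hypothetical.
import time
from prometheus_client import Counter, Gauge, start_http_server

RECORDS = Counter("ingest_records_total",
                  "Records processed by the component", ["component"])
BATCH_SECONDS = Gauge("ingest_last_batch_seconds",
                      "Wall-clock duration of the last batch", ["component"])

def process_batch(component: str, batch) -> None:
    start = time.monotonic()
    for _ in batch:          # stand-in for real per-record work
        pass
    RECORDS.labels(component).inc(len(batch))
    BATCH_SECONDS.labels(component).set(time.monotonic() - start)

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://<pod>:8000/metrics
    while True:
        process_batch("cdc-source", range(100))
        time.sleep(5)
```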