Ingestion Process
| Requirement | Priority | Response |
|---|---|---|
| Change data capture | Medium | Integration with third-party tools such as Debezium can be provided (see the CDC sketch below). |
| Scheduled ingestion | Very high | Yes, it is available off the shelf. |
| Minimise fields | Very high | Field minimisation can be defined through the Data Preparation tool and published for use in live Data Pipelines (see the field-minimisation sketch below). |
| Filter by lookup | Very high | Yes, it is a standard component. |
| Filter by consent | Very high | It can be achieved via API integration with the consent system, or through a consent-database lookup (see the consent-filter sketch below). |
| Anonymise fields | Very high | Standard anonymisation is available via the Data Preparation option or the Spark SQL component (see the anonymisation sketch below). |
| Compose Ingestion Processors | Very high | Drag-and-drop, low-code platform. |
| Ingestion fault tolerance | Very high | Ability to track faults and initiate a sub-process (see the fault-handling sketch below). |
| Bootstrap + updates | Very high | A pipeline can be defined to load historic data and then subsequent updates, as per the data-load strategy (see the bootstrap sketch below). |
| Reports + Metrics | High | The Data Pipeline generates a metrics report for every process, e.g. memory used, CPU used, and number of records processed. |
| Performance impact threshold | High | Compute-resource allocation and the number of instances to scale up to are configurable. |
| Secrets Management integration | Very high | All secrets are stored in Kubernetes Secrets, and the platform provides direct integration with these (see the secrets sketch below). |
| Data Catalogue integration | Very high | The platform automatically generates a data catalogue from the underlying metadata. |
| Visual interface | Very high | Pipeline Studio provides a drag-and-drop visual interface, following a no-code/low-code approach. |
| Ingestion Manifest file | Very high | It is achievable via internal metadata. |
| CI/CD Pipelines Integration | High | Yes, pipeline definitions and metadata can be checked in and out of GitLab. |
| Ingestion Access Management | Very high | The Data Pipeline supports RBAC. |
| Ingestion Audit Logs | Very high | Logs can be pushed to third-party log-monitoring systems such as Datadog, Prometheus, etc. |
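CDC sketch. As one way the Debezium integration could surface change events, here is a minimal Python sketch that consumes the standard Debezium JSON envelope from Kafka; the topic name, broker address, and upsert/delete handling are illustrative assumptions, not the platform's actual connector.

```python
# Minimal sketch: consuming Debezium change events from Kafka.
# Topic name, broker address, and handler logic are assumptions.
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "dbserver1.inventory.customers",       # assumed Debezium topic name
    bootstrap_servers=["localhost:9092"],  # assumed broker address
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    # Debezium wraps the event in a "payload" key when schemas are enabled.
    envelope = message.value.get("payload", message.value)
    op = envelope["op"]  # c=create, u=update, d=delete, r=snapshot read
    if op in ("c", "r", "u"):
        row = envelope["after"]    # new state of the row
        print("upsert", row)
    elif op == "d":
        key = envelope["before"]   # last known state before deletion
        print("delete", key)
```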
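Field-minimisation sketch. A minimal PySpark illustration of the idea behind the Data Preparation step: only the columns the downstream use case needs are carried forward. The source path and column names are assumptions.

```python
# Minimal sketch: ingest only the fields that are actually required.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("minimise-fields").getOrCreate()

raw = spark.read.json("s3://landing/customers/")  # assumed source path
# Keep the minimal field set; every other column is dropped at ingestion.
minimal = raw.select("customer_id", "country", "signup_date")
minimal.write.mode("append").parquet("s3://curated/customers/")
```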
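Consent-filter sketch. A minimal PySpark illustration of the consent-database lookup approach: records are kept only when an active consent row exists for the subject. The JDBC URL, table, and column names are assumptions, and the JDBC driver must be on the classpath.

```python
# Minimal sketch: filter ingested records against a consent database.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("consent-filter").getOrCreate()

events = spark.read.parquet("s3://landing/events/")  # assumed input

# Look up active consents from the consent system's database (assumed schema).
consent = (
    spark.read
    .format("jdbc")
    .option("url", "jdbc:postgresql://consent-db:5432/consent")
    .option("dbtable", "public.consents")
    .load()
    .where("status = 'GRANTED'")
    .select("customer_id")
)

# Inner join drops every event whose subject has no active consent record.
permitted = events.join(consent, on="customer_id", how="inner")
permitted.write.mode("append").parquet("s3://curated/events/")
```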
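Anonymisation sketch. A minimal illustration of anonymisation through the Spark SQL component, hashing direct identifiers and dropping free text; the view, paths, and column names are assumptions.

```python
# Minimal sketch: anonymise fields with Spark SQL before publishing.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("anonymise-fields").getOrCreate()

spark.read.parquet("s3://curated/customers/").createOrReplaceTempView("customers")

anonymised = spark.sql("""
    SELECT
        sha2(email, 256) AS email_hash,   -- one-way hash of the identifier
        sha2(phone, 256) AS phone_hash,
        NULL             AS notes,        -- drop free text entirely
        country,                          -- coarse fields pass through
        signup_date
    FROM customers
""")
anonymised.write.mode("overwrite").parquet("s3://published/customers_anon/")
```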
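Fault-handling sketch. A minimal, platform-agnostic illustration of the "track faults and initiate a sub-process" pattern: failing records are diverted to a dead-letter list rather than failing the batch, and a repair sub-process is triggered. All function and variable names are hypothetical.

```python
# Minimal sketch: route faulty records aside and trigger a sub-process.
import json

def parse_record(line: str) -> dict:
    return json.loads(line)  # raises ValueError on malformed input

def start_repair_subprocess(failed):
    # Hypothetical hook into the platform's sub-process mechanism.
    print(f"initiating repair sub-process for {len(failed)} failed records")

def run_batch(lines, dead_letter):
    good = []
    for line in lines:
        try:
            good.append(parse_record(line))
        except ValueError:
            dead_letter.append(line)  # track the fault, keep the batch alive
    if dead_letter:
        start_repair_subprocess(dead_letter)
    return good

batch = ['{"id": 1}', "not-json", '{"id": 2}']
dead = []
print(run_batch(batch, dead), dead)
```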
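Bootstrap sketch. A minimal PySpark illustration of a bootstrap-then-update load using a high-watermark column: the first run loads all historic data, subsequent runs load only rows updated since the last watermark. Paths, the watermark column, and how the watermark is persisted between runs are assumptions.

```python
# Minimal sketch: one-off historic load, then incremental updates.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bootstrap-and-updates").getOrCreate()

def load_since(watermark):
    src = spark.read.parquet("s3://landing/orders/")  # assumed source
    if watermark is not None:
        src = src.where(F.col("updated_at") > F.lit(watermark))
    src.write.mode("append").parquet("s3://curated/orders/")
    return src.agg(F.max("updated_at")).first()[0]    # new high watermark

watermark = None                    # None => bootstrap: load all history
watermark = load_since(watermark)   # initial full load
watermark = load_since(watermark)   # later runs pick up only new updates
```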
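Secrets sketch. A minimal illustration of how a pipeline task running in a pod could consume a Kubernetes Secret: Kubernetes can expose secrets as mounted files or environment variables. The mount path and key name here are assumptions.

```python
# Minimal sketch: read a Kubernetes Secret from inside a pipeline pod.
import os
from pathlib import Path

def read_secret(name: str) -> str:
    # Preferred: secret mounted as a file via a volumeMount (assumed path).
    mounted = Path("/var/run/secrets/pipeline") / name
    if mounted.exists():
        return mounted.read_text().strip()
    # Fallback: secret injected as an environment variable.
    return os.environ[name.upper().replace("-", "_")]

db_password = read_secret("db-password")  # hypothetical secret key
```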