Data Pipeline
Yes, there is an option for R Scripting in the BDB Data Pipeline module.
Yes, the BDB Data Pipeline can read XML and JSON data from AWS S3, the desktop, or a folder on a server.
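As a rough illustration only (this is not the BDB component itself, and the bucket and key names are placeholders), reading JSON and XML objects from S3 in Python could look like this:

```python
import json
import xml.etree.ElementTree as ET

import boto3  # AWS SDK for Python

s3 = boto3.client("s3")

def read_json_from_s3(bucket: str, key: str) -> dict:
    """Fetch an S3 object and parse it as JSON."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    return json.loads(body)

def read_xml_from_s3(bucket: str, key: str) -> ET.Element:
    """Fetch an S3 object and parse it as XML."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    return ET.fromstring(body)

# Placeholder bucket/keys -- replace with real locations.
orders = read_json_from_s3("example-bucket", "input/orders.json")
catalog = read_xml_from_s3("example-bucket", "input/catalog.xml")
```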
BDB Pipeline enables data engineers to configure the resource allocation for different components based on the load and operation, and it allows the creation of multiple component instances that run in parallel to complete the operation in the desired time frame. The BDB Data Pipeline module also has an integrated pipeline monitoring section that shows important performance statistics for the selected pipeline: memory and CPU utilization, allocated CPU and memory, the number of records processed, the last processed count, the total number of records processed, the last processed record size, and the number of instances for each named component. Along with these key parameters, a component log is also available.
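Purely as a sketch of what tracking such statistics involves (the stage, the record source, and the metric names below are hypothetical, and psutil is a third-party package, not a BDB dependency), a processing step could be instrumented like this:

```python
import time

import psutil  # third-party: pip install psutil

def run_stage(records, process_record):
    """Process records while collecting the kinds of stats a
    pipeline monitor reports (records processed, CPU/memory use)."""
    stats = {"records_processed": 0, "last_record_size": 0}
    start = time.time()
    for record in records:
        process_record(record)
        stats["records_processed"] += 1
        stats["last_record_size"] = len(str(record))
    stats["elapsed_s"] = round(time.time() - start, 3)
    stats["cpu_percent"] = psutil.cpu_percent(interval=None)
    stats["memory_percent"] = psutil.virtual_memory().percent
    return stats

print(run_stage(({"id": i} for i in range(1000)), lambda r: None))
```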
BDB Data Pipeline handles incremental data extraction to the BDB Data Store. The pipeline has a built-in scheduler component that runs at the scheduled interval to capture the increments. It also enables capturing the change stream.
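For a MongoDB source, for instance, the change stream could be captured with pymongo as in the minimal sketch below; the connection string and namespace are placeholders, and MongoDB change streams require a replica set or sharded cluster:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
orders = client["sales"]["orders"]                 # placeholder namespace

def capture_increments(sink):
    """Stream inserts/updates as they happen and forward each
    changed document to the downstream sink (e.g., the data store)."""
    pipeline = [{"$match": {"operationType": {"$in": ["insert", "update"]}}}]
    with orders.watch(pipeline, full_document="updateLookup") as stream:
        for change in stream:
            sink(change["fullDocument"])

# capture_increments(lambda doc: print(doc["_id"]))
```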
BDB Data Pipeline has a built-in scheduler component to schedule the data refresh.
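As a stand-in for that scheduler component (not its actual implementation), the sketch below uses the third-party schedule package to run a hypothetical refresh_data job at a fixed interval:

```python
import time

import schedule  # third-party: pip install schedule

def refresh_data():
    """Hypothetical job: re-run the extraction to pick up increments."""
    print("refreshing data...")

# Run the refresh every hour; cron-style intervals are also common.
schedule.every(1).hours.do(refresh_data)

while True:
    schedule.run_pending()
    time.sleep(60)
```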
The performance tuning options offered by the BDB Platform are as follows: Data Pipeline: Users can monitor data workflows through the monitoring feature provided in the BDB Data Pipeline module. It supports a variety of performance-tuning goals:
- Optimizing resource utilization
- Detecting and preventing errors (see the retry sketch after this list)
- Improving the efficiency of pipelines
- Ensuring data quality
- Increasing transparency
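As one generic illustration of detecting and preventing errors (a common pattern, not a documented BDB feature), a pipeline stage can be wrapped with retries and failure logging:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def with_retries(stage, record, attempts=3, backoff_s=2.0):
    """Run a stage on one record, retrying transient failures with
    exponential backoff and logging anything that still fails."""
    for attempt in range(1, attempts + 1):
        try:
            return stage(record)
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                log.error("giving up on record %r", record)
                raise
            time.sleep(backoff_s * 2 ** (attempt - 1))
```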
For example, if a user writes MongoDB queries within the BDB Data Set and the query response time is prolonged, the user can apply performance-tuning options such as aggregated views, indexing, and sharding to improve query execution performance.
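In pymongo terms, and with placeholder connection details and collection names, two of those tunings could look like this: indexing the field a slow query filters on, and materializing an aggregated view with $merge so queries read precomputed results:

```python
from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
db = client["sales"]

# Index the field the slow query filters on.
db["orders"].create_index([("customer_id", ASCENDING)])

# Precompute revenue per customer into a separate collection, so the
# dataset queries a small aggregated view instead of the raw orders.
db["orders"].aggregate([
    {"$group": {"_id": "$customer_id", "revenue": {"$sum": "$amount"}}},
    {"$merge": {"into": "orders_by_customer", "whenMatched": "replace"}},
])
```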