A summary of the functional requirements for the Enterprise Data Warehouse (EDW) is as follows:
The Data Platform is required to be hosted on a cloud-based environment accessible to the client’s business users globally.
The Data Platform should support multi-cloud integration, as well as integration with On-Premises applications/databases hosted at Manufacturing sites, 3rd party applications, and applications hosted in the cloud.
The Data Platform should have out-of-the-box proven integration capabilities with data streaming applications such as Kafka.
The Data Platform should have out-of-the-box proven integration capabilities with advanced analytics (machine learning) tools/applications including Open Source.
The client’s Business users should be able to remotely access the data platform and analyze the data and reports with reduced latency.
The data platform should be capable of supporting a large number of concurrent users without any performance issues. The data platform should be available 24/7, with 99.5% availability and no data loss. Services shall be available across regions with 'zero' downtime.
Capability to ingest, curate, process, and store various types of data from client and 3rd party applications, in real-time/near-real-time and batch modes, both On-Premises and in the Cloud, on the data lake of the client's choice.
Ability to perform data profiling, apply data quality checks, and do data cleansing on the fly.
Cost-effective compute and storage capacity to process and store data in different formats, scaling automatically with ‘zero downtime’.
The data platform should be capable of providing an enterprise view of data across the organization, which Business users can access seamlessly through Self-Service reporting.
Provide a centralized data repository (Sandbox environment) with a robust data foundation to enable data discovery, data mining, and advanced analytics.
The Data Platform should be secure, scalable, fault-tolerant, and highly available.
Shift from decentralized reporting and data silos to a governed, centralized reporting framework.
The Data Platform shall support full-text search and faceted text search capabilities, with the ability to search both structured and unstructured data in both the data lake and the EDW.
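As a rough check on the 99.5% availability requirement stated above, the following minimal sketch converts an availability percentage into a downtime budget. The function name `downtime_budget_hours` and the 30-day and 365-day periods are illustrative assumptions, not part of the requirements.

```python
# Illustrative arithmetic only: converts an availability target into the
# maximum downtime it permits. Names and periods are assumptions.
HOURS_PER_DAY = 24


def downtime_budget_hours(availability_pct: float, period_days: float) -> float:
    """Return the maximum downtime, in hours, allowed over the given period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_hours = period_days * HOURS_PER_DAY
    return total_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Assumed reporting periods: a 30-day month and a 365-day year.
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        print(f"{label}: {downtime_budget_hours(99.5, days):.1f} h")
```

Under these assumptions, 99.5% availability permits roughly 3.6 hours of downtime in a 30-day month and roughly 43.8 hours in a 365-day year, so the 'zero' downtime wording for cross-region services is the stricter of the two targets.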