This section focuses on the Configuration tab provided for any Pipeline component.
For each component that gets deployed, we can configure the resources assigned to it, i.e., Memory and CPU.
We have two deployment types:
Docker
Spark
Go through the given illustration to understand how to configure a component using the Docker deployment type.
After we save the component and the pipeline, the component is saved with the default configuration of the pipeline (Low, Medium, or High). Once the pipeline is saved, the Configuration tab becomes available in the component, and it exposes the settings described below.
For the Docker components, we have the Request and Limit configurations.
We can see the CPU and Memory options to be configured.
CPU: This is the CPU configuration where we can specify the number of cores that we need to assign to the component.
Please Note: 1000 means 1 core in the configuration of Docker components. Entering 100 assigns 0.1 core to the component.
Memory: This option specifies how much memory we want to dedicate to that specific component.
Please Note: 1024 means 1 GB in the configuration of Docker components.
Instances: The number of instances is used for parallel processing. If we specify N instances, N pods get deployed.
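The unit conventions above can be sketched in a few lines of Python. This is only an illustration, not part of the product: the DockerResources class and its field names are hypothetical, and the totals assume that every deployed pod receives the same CPU and Memory values.

```python
from dataclasses import dataclass

MILLICORES_PER_CORE = 1000  # from the note above: 1000 means 1 core
MEMORY_UNITS_PER_GB = 1024  # from the note above: 1024 means 1 GB


@dataclass(frozen=True)
class DockerResources:
    """Hypothetical holder for the values entered in the Configuration tab."""
    cpu: int            # CPU field value, e.g. 100 for 0.1 core
    memory: int         # Memory field value, e.g. 1024 for 1 GB
    instances: int = 1  # N instances means N pods

    def cores(self) -> float:
        return self.cpu / MILLICORES_PER_CORE

    def gigabytes(self) -> float:
        return self.memory / MEMORY_UNITS_PER_GB

    def total_cores(self) -> float:
        # Assumption: each pod gets the same CPU value, so usage scales with N.
        return self.cores() * self.instances

    def total_gigabytes(self) -> float:
        # Assumption: each pod gets the same Memory value.
        return self.gigabytes() * self.instances


config = DockerResources(cpu=500, memory=2048, instances=3)
print(config.cores(), config.gigabytes())              # 0.5 2.0
print(config.total_cores(), config.total_gigabytes())  # 1.5 6.0
```

A quick calculation like this helps estimate the overall footprint before increasing the number of instances.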
Go through the walk-through given below to understand the steps to configure a component with the Spark configuration type.
The configuration of Spark components is slightly different from that of Docker components. When a Spark component is deployed, two pods come up:
Driver
Executor
Provide the Driver and Executor configurations separately.
Instances: The number of instances is used for parallel processing. If we specify N instances in the Executor configuration, N executor pods get deployed.
Please Note: As of the current release, the minimum requirement is 0.1 core to deploy a driver and 1 core to deploy an executor. This may change with upcoming versions of Spark.
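The minimums in the note above can be checked before deployment with a small helper. The following Python sketch is hypothetical: the function name and parameters are illustrative, the values are expressed directly in cores, and the rule that at least one executor instance is required is an assumption rather than something the platform documents here.

```python
from typing import List

# Minimums quoted in the note above for the current release.
MIN_DRIVER_CORES = 0.1
MIN_EXECUTOR_CORES = 1.0


def check_spark_cpu(driver_cores: float,
                    executor_cores: float,
                    executor_instances: int = 1) -> List[str]:
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if driver_cores < MIN_DRIVER_CORES:
        problems.append(
            f"driver needs at least {MIN_DRIVER_CORES} core, got {driver_cores}")
    if executor_cores < MIN_EXECUTOR_CORES:
        problems.append(
            f"executor needs at least {MIN_EXECUTOR_CORES} core, got {executor_cores}")
    if executor_instances < 1:
        # Assumption: at least one executor pod is needed to run the component.
        problems.append("executor instances must be 1 or more")
    return problems


print(check_spark_cpu(0.1, 1.0, executor_instances=2))  # []
for problem in check_spark_cpu(0.05, 0.5, executor_instances=0):
    print(problem)
```

Because the minimums may change with future Spark versions, they are kept as named constants that can be adjusted in one place.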