DB Writer

The DB Writer is a Spark-based writer component that lets you write data to multiple database sources.

All component configurations are classified broadly into the following sections:

Please check out the given demonstration to configure the component.

Drivers Available

  • MySQL

  • Oracle

  • PostgreSQL

  • MS-SQL

  • ClickHouse

  • Snowflake

Please Note:

  • The ClickHouse driver in the Spark components uses the HTTP port, not the TCP port.

  • It is recommended to create the table before activating the pipeline, because an RDBMS enforces a strict schema and a missing or mismatched table can cause errors.
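To illustrate the port note above, here is a minimal sketch of a connection URL that targets the ClickHouse HTTP port. It assumes a default server setup, where 8123 serves HTTP and 9000 serves the native TCP protocol. The helper name, host, and database are hypothetical and not part of the component.

```python
# Default ClickHouse ports (assumption: the server uses the stock configuration).
CLICKHOUSE_HTTP_PORT = 8123
CLICKHOUSE_NATIVE_TCP_PORT = 9000


def clickhouse_jdbc_url(host: str, database: str, port: int = CLICKHOUSE_HTTP_PORT) -> str:
    """Build a JDBC-style ClickHouse URL (hypothetical helper for illustration)."""
    if port == CLICKHOUSE_NATIVE_TCP_PORT:
        # Under the default configuration, 9000 is the native TCP port.
        raise ValueError("9000 is the default native TCP port; use the HTTP port instead")
    return f"jdbc:clickhouse://{host}:{port}/{database}"


# Hypothetical host and database names.
url = clickhouse_jdbc_url("clickhouse.example.internal", "analytics")
print(url)
```

If the server exposes HTTP on a non-default port, pass that port explicitly instead of relying on the default.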

Save Modes

The RDBMS writer supports 3 save modes:

Append

As the name suggests, this mode adds all records without any validation.

Overwrite

This mode truncates the table and adds fresh records. After every run, the table contains only the records that were part of that batch.

Upsert

This operation lets users insert a new record or update existing data in a table. To configure it, provide the composite key.

The BDB Data Pipeline supports composite-key-based upsert. For a composite key, specify the additional key using a comma separator, e.g., key1, key2. The component also has an option to upload the Spark schema in JSON format. This can greatly improve the speed of the write operation, because the component skips schema inference and uses the provided schema.
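The three save modes can be sketched in plain SQL terms. The example below is an illustration only, not the component's implementation: it uses Python's built-in sqlite3 module (SQLite 3.24 or later for the ON CONFLICT clause) and a hypothetical table named target with the composite key key1, key2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The composite key (key1, key2) is the primary key, so upsert can detect conflicts.
conn.execute(
    "CREATE TABLE target (key1 INTEGER, key2 TEXT, value REAL, PRIMARY KEY (key1, key2))"
)


def append(rows):
    """Append: insert every row as-is."""
    conn.executemany("INSERT INTO target VALUES (?, ?, ?)", rows)


def overwrite(rows):
    """Overwrite: clear the table, then insert only the current batch."""
    conn.execute("DELETE FROM target")
    conn.executemany("INSERT INTO target VALUES (?, ?, ?)", rows)


def upsert(rows):
    """Upsert: insert new keys, update rows whose (key1, key2) already exists."""
    conn.executemany(
        "INSERT INTO target (key1, key2, value) VALUES (?, ?, ?) "
        "ON CONFLICT (key1, key2) DO UPDATE SET value = excluded.value",
        rows,
    )


def snapshot():
    """Return all rows in key order."""
    return conn.execute(
        "SELECT key1, key2, value FROM target ORDER BY key1, key2"
    ).fetchall()


append([(1, "a", 10.0), (2, "b", 20.0)])
upsert([(1, "a", 11.0), (3, "c", 30.0)])  # updates (1, "a"), inserts (3, "c")
after_upsert = snapshot()

overwrite([(9, "z", 90.0)])  # earlier rows are gone after this call
after_overwrite = snapshot()

print(after_upsert)
print(after_overwrite)
```

Because this sketch declares a primary key, appending a row whose key already exists would be rejected by the database itself; whether that happens in a real target table depends on the constraints defined there.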

Please Note: Upsert is comparatively slow for the ClickHouse component. It is preferable to create a table with the ReplacingMergeTree engine and a view that reads from the table with the FINAL clause, and to keep the write mode in the component set to Append.
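The ClickHouse recommendation above could look like the following sketch. The table, view, and column names are hypothetical, and the updated_at version column is an assumption added so that ReplacingMergeTree keeps the most recent row for each key. The statements are held in Python strings so they can be passed to whichever client is in use.

```python
# Hypothetical names; adjust to the real schema.
TABLE = "events"
VIEW = "events_latest"

# Rows sharing the same ORDER BY key are collapsed during merges,
# keeping the row with the highest updated_at value.
create_table_sql = f"""
CREATE TABLE IF NOT EXISTS {TABLE}
(
    key1 UInt32,
    key2 String,
    value Float64,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY (key1, key2)
""".strip()

# FINAL applies the deduplication at query time, so readers of the view
# see one row per key even before background merges have run.
create_view_sql = f"CREATE VIEW IF NOT EXISTS {VIEW} AS SELECT * FROM {TABLE} FINAL"

print(create_table_sql)
print(create_view_sql)
```

With this layout the component writes in Append mode to the table, and downstream queries read from the view.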

  • Database name: Enter the Database name.

  • Table name: Provide a table name where the data has to be written.

  • Enable SSL: Check this box to enable SSL for this component. The Enable SSL option appears only for three (3) drivers: MongoDB, PostgreSQL, and ClickHouse.

  • Certificate Folder: This option appears when the Enable SSL field is checked. Select the certificate folder from the drop-down; it contains the files that have been uploaded to the admin settings. Please refer to the images below for reference.

  • Schema File Name: Upload the Spark Schema in JSON format.

  • Query: In this field, write a DDL statement that creates the table in the database where the in-event data has to be written. For an example, please refer to the image below:
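As a text sketch of the Schema File Name and Query fields, the example below builds a Spark schema in the JSON layout that Spark's StructType.json() emits and derives a matching CREATE TABLE statement from it. The column names, the table name, and the Spark-to-PostgreSQL type mapping are assumptions for illustration; it is also an assumption that the component accepts this exact JSON layout.

```python
import json

# Hypothetical columns, in the layout produced by Spark's StructType.json().
spark_schema = {
    "type": "struct",
    "fields": [
        {"name": "key1", "type": "integer", "nullable": False, "metadata": {}},
        {"name": "key2", "type": "string", "nullable": False, "metadata": {}},
        {"name": "value", "type": "double", "nullable": True, "metadata": {}},
    ],
}

# Assumed mapping from Spark type names to PostgreSQL column types.
SPARK_TO_POSTGRES = {
    "integer": "INTEGER",
    "string": "TEXT",
    "double": "DOUBLE PRECISION",
}


def ddl_from_schema(schema: dict, table: str) -> str:
    """Derive a CREATE TABLE statement from a Spark schema dict (hypothetical helper)."""
    columns = []
    for field in schema["fields"]:
        null_clause = "" if field["nullable"] else " NOT NULL"
        columns.append(f'{field["name"]} {SPARK_TO_POSTGRES[field["type"]]}{null_clause}')
    return f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(columns)})"


schema_json = json.dumps(spark_schema, indent=2)  # content for the schema file
ddl = ddl_from_schema(spark_schema, "target_table")  # text for the Query field

print(schema_json)
print(ddl)
```

Deriving both artifacts from one definition keeps the uploaded schema and the table DDL aligned, which matters because the target database enforces a strict schema.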
