Azure Blob Writer
The Azure Blob Writer component writes data into Azure Blob Storage. It supports multiple file formats, partitioning, and save modes. Authentication is handled using either Storage Account Key or Azure AD Service Principal (Client Secret) credentials.
Configuration Sections
The Azure Blob Writer configurations are organized into the following sections:
Basic Information
Meta Information
Resource Configuration
Connection Validation
Authentication Methods
1. Write Using Secret Key
Authenticate using the Storage Account Key.
Account Key
Storage account key used to authenticate.
xxxx12345...
Account Name
Name of the Azure storage account.
mystorageacct
Container
Name of the target container.
sales-data
Blob Name
Target blob name (path + file).
transactions/2025-01-01.csv
File Format
Output file type: CSV, JSON, PARQUET, or AVRO.
PARQUET
Save Mode
Write behavior: Append or Overwrite.
Overwrite
Schema File Name
Spark schema file in JSON format.
schema.json
Column Filter
Define which columns to write. See Column Filtering.
N/A
Partition Column
Partition data by column(s).
date, region
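As an illustration of how the Secret Key fields fit together, the sketch below builds a `wasbs://` blob URL and the matching Hadoop account-key property from the Account Name, Container, and Blob Name values. The helper names are hypothetical, and the property name assumes the classic `wasbs` connector; the component may wire these settings up differently.

```python
def blob_url(account: str, container: str, blob: str) -> str:
    """Combine the Account Name, Container, and Blob Name fields into a wasbs:// URL."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{blob}"

def account_key_conf(account: str, key: str) -> dict:
    """Hadoop-style configuration entry for Storage Account Key authentication."""
    return {f"fs.azure.account.key.{account}.blob.core.windows.net": key}

url = blob_url("mystorageacct", "sales-data", "transactions/2025-01-01.csv")
# url == "wasbs://sales-data@mystorageacct.blob.core.windows.net/transactions/2025-01-01.csv"
```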
2. Write Using Principal Secret
Authenticate using an Azure Active Directory (Azure AD) Service Principal.
Client ID
Application (client) ID from Azure AD.
2c76b0a9-xxxx-xxxx-xxxx-abcdef
Tenant ID
Directory (tenant) ID from Azure AD.
72f988bf-xxxx-xxxx-xxxx-abcdef
Client Secret
Secret key of the registered application.
********
Account Name
Name of the Azure storage account.
mystorageacct
Container
Name of the target container.
finance-data
Blob Name
Target blob name.
monthly/summary.parquet
File Format
Output file type: CSV, JSON, PARQUET, or AVRO.
JSON
Save Mode
Write behavior: Append or Overwrite.
Append
Schema File Name
Spark schema file in JSON format.
finance_schema.json
Column Filter
Define which columns to write. See Column Filtering.
N/A
Partition Column
Partition data by column(s).
year, department
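For the Service Principal path, Spark-based writers typically pass the Client ID, Tenant ID, and Client Secret through Hadoop ABFS OAuth properties. The sketch below shows one plausible mapping; the exact property names depend on the connector version and are an assumption here, not a documented part of this component.

```python
def oauth_conf(account: str, client_id: str, tenant_id: str, client_secret: str) -> dict:
    """Hadoop ABFS OAuth settings commonly used for Service Principal auth (illustrative)."""
    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }
```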
Save Modes
Append
Adds new data alongside the existing data at the target location.
Overwrite
Replaces the existing data at the target location with the new data.
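The two save modes can be modeled with a toy function; this is only an illustration of the semantics, not the component's implementation.

```python
def write(existing: list, new_rows: list, save_mode: str) -> list:
    """Toy model of the save modes: Append keeps existing rows, Overwrite replaces them."""
    if save_mode == "Append":
        return existing + new_rows
    if save_mode == "Overwrite":
        return list(new_rows)
    raise ValueError(f"unsupported save mode: {save_mode}")
```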
Column Filtering
The Column Filter section allows you to select and rename columns before writing to Azure Blob.
Name
Name of the column from upstream data.
customer_id
Alias
Alias under which the column is written to the container.
cust_id
Column Type
Data type of the column.
STRING
Additional Options:
Upload: Upload CSV/JSON/Excel to auto-populate schema.
Download Data: Export schema mapping in JSON format.
Delete Data: Clear all column filter entries.
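Conceptually, each column filter entry keeps a column, renames it to its alias, and casts it to the configured type. The sketch below models that with plain dictionaries; the field names mirror the Name, Alias, and Column Type settings above and are purely illustrative.

```python
# One entry per configured column, mirroring the Name / Alias / Column Type fields.
COLUMN_FILTER = [
    {"name": "customer_id", "alias": "cust_id", "type": "STRING"},
]

def apply_column_filter(row: dict, column_filter: list) -> dict:
    """Keep only the configured columns and emit them under their aliases."""
    out = {}
    for col in column_filter:
        value = row[col["name"]]
        if col["type"] == "STRING":
            value = str(value)
        out[col["alias"]] = value
    return out

filtered = apply_column_filter({"customer_id": 42, "email": "a@b.c"}, COLUMN_FILTER)
# filtered == {"cust_id": "42"}
```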
Partitioning
Partitioning organizes data in the container by column values, improving query performance and management.
Example: Partition by date
azure://mystorageacct/sales-data/date=2025-01-01/
azure://mystorageacct/sales-data/date=2025-01-02/
azure://mystorageacct/sales-data/date=2025-01-03/
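The Hive-style `key=value` layout shown above can be reproduced with a small path builder; the function name and the `azure://` base path are illustrative, not part of the component.

```python
def partition_prefix(base: str, row: dict, partition_columns: list) -> str:
    """Build a Hive-style key=value directory path, as in the example paths above."""
    parts = [f"{col}={row[col]}" for col in partition_columns]
    return base.rstrip("/") + "/" + "/".join(parts) + "/"

prefix = partition_prefix("azure://mystorageacct/sales-data", {"date": "2025-01-01"}, ["date"])
# prefix == "azure://mystorageacct/sales-data/date=2025-01-01/"
```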
Notes
Ensure the service principal or account has proper RBAC roles (e.g., Storage Blob Data Contributor) on the target storage account.
Prefer Parquet or Avro for production workloads due to better compression and schema support.
Use partitioning for large datasets to improve performance and data organization.
For secure deployments, store Account Keys and Client Secrets in Azure Key Vault.