Azure Blob Reader (Docker)
The Azure Blob Reader (Docker) component reads data stored in Azure Blob Storage. It runs as a Docker-based component and supports multiple authentication mechanisms for secure access. Files can be ingested in common formats including CSV, JSON, Parquet, Avro, and XML.
Configuration Sections
The Azure Blob Reader component configuration is organized into the following sections:
Basic Information
Meta Information
Resource Configuration
Connection Validation
Authentication Methods
The component supports three authentication methods for connecting to Azure Blob Storage:
Shared Access Signature (SAS) – Recommended for temporary, revocable access
Secret Key (Storage Account Key) – Full account access; use with caution
Principal Secret (Azure AD Service Principal) – Enterprise-grade, app-based access
⚠️ Security Best Practices
Prefer SAS tokens for temporary and granular access.
Store Secret Keys and Principal Secrets securely in Azure Key Vault.
Avoid hardcoding credentials in pipelines.
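One way to keep credentials out of pipeline definitions is to read them from the environment at run time. The sketch below is illustrative only: the variable name AZURE_SAS_TOKEN and both helper functions are hypothetical, and it assumes a secrets manager such as Azure Key Vault populates the environment at deploy time.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required credential is not set in the environment."""


def load_credential(name, env=os.environ):
    """Return the credential stored under `name`, or fail loudly.

    `name` is a hypothetical environment variable; nothing here is part of
    the component itself.
    """
    value = env.get(name, "").strip()
    if not value:
        raise MissingCredentialError(f"environment variable {name} is not set")
    return value


def mask(secret, visible=4):
    """Mask a secret for logging, keeping only the first few characters."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


if __name__ == "__main__":
    demo_env = {"AZURE_SAS_TOKEN": "?sv=2025-01-01&ss=b"}  # stand-in for os.environ
    token = load_credential("AZURE_SAS_TOKEN", demo_env)
    print(mask(token))
```

Failing fast on a missing variable avoids a pipeline silently running with an empty credential, and masking keeps secrets out of logs.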
1. Using Shared Access Signature (SAS)
Shared Access Signature: SAS token (the query string of a SAS URI) granting restricted access to storage resources. Example: ?sv=2025-01-01&ss=b&srt=...
Account Name: Azure storage account name. Example: myazureaccount
Container: Name of the container. Example: sales-data
File Type: Format of the files to read: CSV, JSON, PARQUET, AVRO, or XML. Example: CSV
Read Directory: If enabled, reads all blobs in the container. Example: true (default)
Blob Name: Specific blob to read (used only when Read Directory is disabled). Example: transactions.csv
Column Filter: Filter columns with alias and data type. See Column Filtering.
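To make the relationship between these fields concrete, the following sketch shows how an account name, container, blob name, and SAS token could map onto Azure Blob Storage REST URLs, and how the token's expiry (the se parameter) can be checked before a run. It is not the component's implementation; the function names are hypothetical, and the expiry parser assumes the full YYYY-MM-DDThh:mm:ssZ timestamp form.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, quote


def sas_expiry(sas_token):
    """Return the expiry time (`se`) of a SAS token, or None if it has none."""
    params = parse_qs(sas_token.lstrip("?"))
    values = params.get("se")
    if not values:
        return None
    # Assumes the full ISO 8601 UTC form, e.g. 2030-01-01T00:00:00Z.
    parsed = datetime.strptime(values[0], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def build_url(account, container, sas_token, read_directory=True, blob_name=None):
    """Build the REST URL for listing a container or reading a single blob."""
    base = f"https://{account}.blob.core.windows.net/{container}"
    token = sas_token.lstrip("?")
    if read_directory:
        # List Blobs operation on the container.
        return f"{base}?restype=container&comp=list&{token}"
    if not blob_name:
        raise ValueError("Blob Name is required when Read Directory is disabled")
    return f"{base}/{quote(blob_name)}?{token}"


if __name__ == "__main__":
    # Placeholder token; a real one also carries a valid `sig` signature.
    token = "?sv=2025-01-01&ss=b&srt=co&sp=rl&se=2030-01-01T00%3A00%3A00Z&sig=abc"
    print(sas_expiry(token))
    print(build_url("myazureaccount", "sales-data", token,
                    read_directory=False, blob_name="transactions.csv"))
```

Checking the expiry up front turns a late authorization failure into an early, readable error.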
2. Using Secret Key
Account Key: Storage account key for Shared Key authorization. Example: xxxx12345...
Account Name: Azure storage account name. Example: myazureaccount
Container: Name of the container. Example: finance-data
File Type: Format of the files to read: CSV, JSON, PARQUET, or AVRO. Example: PARQUET
Read Directory: If enabled, reads all blobs in the container. Example: true
Blob Name: Specific blob to read (used only when Read Directory is disabled). Example: q1_data.parquet
Column Filter: Filter columns with alias and data type. See Column Filtering.
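Outside the component, an account name and account key are commonly combined into an Azure Storage connection string. The sketch below assembles one and checks the account name against Azure's naming rule (3 to 24 lowercase letters and digits). The helper names are hypothetical, and whether the component builds such a string internally is not stated here.

```python
import re

ACCOUNT_NAME_PATTERN = re.compile(r"^[a-z0-9]{3,24}$")


def validate_account_name(name):
    """Storage account names are 3-24 characters, lowercase letters and digits."""
    return bool(ACCOUNT_NAME_PATTERN.match(name))


def connection_string(account_name, account_key, endpoint_suffix="core.windows.net"):
    """Assemble a Shared Key connection string from its parts."""
    if not validate_account_name(account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    if not account_key:
        raise ValueError("account key must not be empty")
    parts = {
        "DefaultEndpointsProtocol": "https",
        "AccountName": account_name,
        "AccountKey": account_key,
        "EndpointSuffix": endpoint_suffix,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


if __name__ == "__main__":
    # Placeholder key; load the real one from a secret store, never from source.
    print(connection_string("myazureaccount", "PLACEHOLDER_KEY=="))
```

Because the account key grants full access to the storage account, the resulting string should be treated as a secret in its own right.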
3. Using Principal Secret
Client ID: Application (client) ID from Azure AD. Example: 2c76b0a9-xxxx-xxxx-xxxx-abcdef
Tenant ID: Directory (tenant) ID of your Azure AD instance. Example: 72f988bf-xxxx-xxxx-xxxx-abcdef
Client Secret: Client secret of the service principal. Example: ********
Account Name: Azure storage account name. Example: myazureaccount
File Type: Format of the files to read: CSV, JSON, PARQUET, or AVRO. Example: JSON
Read Directory: If enabled, reads all blobs in the container. Example: true
Blob Name: Specific blob to read (used only when Read Directory is disabled). Example: archive.json
Column Filter: Filter columns with alias and data type. See Column Filtering.
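Service principal authentication typically rests on the OAuth 2.0 client credentials flow: the tenant ID selects the token endpoint, and the client ID and secret are exchanged for a bearer token scoped to Azure Storage. The sketch below only builds that token request and sends nothing. The function name is hypothetical, and it is an assumption that the component follows this flow. The service principal also needs a data-plane role on the storage account, such as Storage Blob Data Reader, for the resulting token to be useful.

```python
from urllib.parse import urlencode
from urllib.request import Request

STORAGE_SCOPE = "https://storage.azure.com/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": STORAGE_SCOPE,
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    # Placeholder identifiers; real values come from the app registration.
    request = build_token_request("my-tenant-id", "my-client-id", "my-secret")
    print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen would return a JSON document whose access_token field is then passed as a Bearer token in the Authorization header of storage requests.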
File Type-Specific Behavior
CSV
Header: Use the first row as column headers.
Infer Schema: Automatically detect schema.
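The effect of the Header and Infer Schema options can be illustrated with a small stand-alone sketch. It is not the component's reader: the type names INTEGER and DOUBLE, the generated column names (_c0, _c1, ...), and the inference rule are assumptions chosen for illustration.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of INTEGER, DOUBLE, STRING that fits every value."""
    for name, cast in (("INTEGER", int), ("DOUBLE", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    return "STRING"


def read_csv(text, header=True, infer_schema=True):
    """Return (schema, rows), where schema is a list of (name, type) pairs."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    if header:
        names, data = rows[0], rows[1:]
    else:
        # Without a header row, generate positional column names.
        names, data = [f"_c{i}" for i in range(len(rows[0]))], rows
    if infer_schema and data:
        types = [infer_type(column) for column in zip(*data)]
    else:
        types = ["STRING"] * len(names)
    return list(zip(names, types)), data


if __name__ == "__main__":
    sample = "customer_id,amount,region\n1,9.5,EU\n2,12,US\n"
    schema, rows = read_csv(sample)
    print(schema)
```

Inference has to scan every value before any row can be typed, which is the processing overhead that an explicit schema avoids.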
JSON
Multiline: Enable for multiline JSON records.
Charset: Character encoding (UTF-8, ISO-8859-1).
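The Multiline and Charset options correspond to two decisions a JSON reader has to make: how to decode the bytes, and whether each line is a separate record or the whole file is one document. The sketch below illustrates both with the standard library. The function name is hypothetical, and treating non-multiline input as one JSON object per line is an assumption about the component's behavior.

```python
import json


def read_json(raw, multiline=False, charset="UTF-8"):
    """Decode `raw` bytes and return a list of records."""
    text = raw.decode(charset)
    if multiline:
        # The whole file is one JSON document, possibly pretty-printed.
        document = json.loads(text)
        return document if isinstance(document, list) else [document]
    # Otherwise each non-empty line is parsed as its own record.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    line_delimited = b'{"id": 1}\n{"id": 2}\n'
    pretty_printed = b'[\n  {"id": 1},\n  {"id": 2}\n]'
    print(read_json(line_delimited))
    print(read_json(pretty_printed, multiline=True))
```

Reading a pretty-printed file without Multiline fails in this sketch because a single line such as "[" is not valid JSON on its own.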
PARQUET
No additional configuration required.
AVRO
Compression: Snappy (default) or Deflate.
Compression Level: Available if Deflate is selected (0–9).
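For background on the 0–9 range: the Avro deflate codec stores blocks as raw DEFLATE data (RFC 1951, without the zlib header or checksum), and the level trades speed for size. The sketch below reproduces that with Python's zlib module. The function names are hypothetical, and it says nothing about how the component itself applies the setting.

```python
import zlib


def deflate_block(data, level=6):
    """Compress `data` as raw DEFLATE at the given level (0-9)."""
    if not 0 <= level <= 9:
        raise ValueError("Compression Level must be between 0 and 9")
    # wbits=-15 selects raw DEFLATE: no zlib header, no checksum.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def inflate_block(data):
    """Decompress a raw DEFLATE block."""
    return zlib.decompress(data, -15)


if __name__ == "__main__":
    payload = b"customer_id,amount\n" * 1000
    for level in (0, 1, 9):
        print(level, len(deflate_block(payload, level)))
```

Level 0 stores the data uncompressed, while higher levels spend more CPU time for smaller output.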
XML
Root Tag: Root element of the XML.
Row Tags: Defines row-level elements.
Join Row Tags: Enable to combine multiple row tags.
Infer Schema: Automatically detect schema from XML structure.
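As an illustration of how a root tag and row tags partition an XML document into records, the sketch below uses the standard library's ElementTree. It is not the component's parser: the function name is hypothetical, and passing more than one row tag is only one possible reading of what Join Row Tags does.

```python
import xml.etree.ElementTree as ET


def read_xml_rows(text, root_tag, row_tags):
    """Return one dict per element whose tag is listed in `row_tags`."""
    root = ET.fromstring(text)
    if root.tag != root_tag:
        raise ValueError(f"expected root <{root_tag}>, found <{root.tag}>")
    rows = []
    for element in root.iter():
        if element.tag in row_tags:
            # Each child element becomes a column; values stay as strings.
            rows.append({child.tag: (child.text or "").strip() for child in element})
    return rows


if __name__ == "__main__":
    document = (
        "<orders>"
        "<order><id>1</id><total>9.5</total></order>"
        "<refund><id>7</id></refund>"
        "<order><id>2</id><total>12</total></order>"
        "</orders>"
    )
    print(read_xml_rows(document, "orders", ["order"]))
    print(read_xml_rows(document, "orders", ["order", "refund"]))
```

With a single row tag only the order elements become rows. With both tags listed, refunds are included in document order and simply lack the total column.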
Column Filtering
The Column Filter section allows selecting specific columns instead of retrieving the entire dataset.
Source Field: Column name from the blob. Example: customer_id
Destination Field: Alias name for the column. Example: cust_id
Column Type: Data type of the column. Example: STRING
Additional Options:
Upload File: Upload CSV/JSON/Excel to auto-populate schema.
Download Data: Export schema in JSON.
Delete Data: Clear schema configuration.
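Conceptually, a column filter is a projection with renaming and casting: each entry selects a Source Field, emits it under the Destination Field, and converts it to the Column Type. The sketch below applies such entries to rows held as dictionaries. The mapping keys (source, destination, type) and the type names other than STRING are hypothetical, not the component's schema format.

```python
# Hypothetical type names mapped to Python casts; only STRING appears in the docs.
CASTS = {"STRING": str, "INTEGER": int, "DOUBLE": float}


def apply_column_filter(rows, mappings):
    """Keep only the mapped columns, renaming and casting each one."""
    result = []
    for row in rows:
        projected = {}
        for mapping in mappings:
            cast = CASTS[mapping["type"]]
            projected[mapping["destination"]] = cast(row[mapping["source"]])
        result.append(projected)
    return result


if __name__ == "__main__":
    rows = [{"customer_id": 42, "amount": "9.5", "region": "EU"}]
    mappings = [
        {"source": "customer_id", "destination": "cust_id", "type": "STRING"},
        {"source": "amount", "destination": "amount", "type": "DOUBLE"},
    ]
    print(apply_column_filter(rows, mappings))
```

Columns without a mapping, such as region in this example, are dropped from the output.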
Notes
SAS tokens are recommended for temporary access with fine-grained control.
Secret Keys grant full control of the storage account; use them only when required and store them securely in Azure Key Vault.
Principal Secret authentication is best for enterprise-scale applications with Azure AD.
For JSON and CSV files, schema inference may add processing overhead; consider providing explicit schemas for production workloads.