S3 Writer
The S3 Writer task writes data to an Amazon S3 bucket in structured formats for downstream analytics and processing.
Prerequisites
An AWS account with sufficient permissions (s3:PutObject, s3:ListBucket) on the target bucket.
An S3 bucket created in the desired Region.
A valid AWS Access Key and Secret Key, or an equivalent IAM role configured for the job runtime.
A defined schema (JSON) when writing CSV or JSON to avoid type drift.
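The permissions listed above can be expressed as an IAM policy document. The following is a minimal sketch, not an official policy for this task: the helper name `build_writer_policy` and the bucket name `example-bucket` are hypothetical, and your organization may require additional statements or conditions.

```python
import json


def build_writer_policy(bucket: str) -> dict:
    """Build an IAM policy document granting s3:ListBucket and s3:PutObject."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3:ListBucket is a bucket-level action, so it targets the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # s3:PutObject is an object-level action, so it targets keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder; substitute the target bucket name.
    print(json.dumps(build_writer_policy("example-bucket"), indent=2))
```

The two statements are kept separate because the bucket-level and object-level actions apply to different resource ARNs.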
Configuring the Meta Information Tab
To configure the S3 Writer:
Drag the S3 Writer task to the Workspace.
Click the task to open its configuration tabs.
The Meta Information tab opens by default. Configure the following fields:
Bucket Name
Enter the name of the Amazon S3 bucket. Do not include the s3:// prefix.
Region
Provide the AWS Region of the bucket (for example, us-east-1).
Access Key
Enter the AWS Access Key ID used for authentication.
Secret Key
Enter the AWS Secret Access Key that corresponds to the Access Key.
Table
Enter the target object name or prefix where the data will be written. For distributed writes, provide a directory-like prefix (e.g., sales/2025/09/).
File Type
Select the file type: CSV, JSON, PARQUET, or AVRO.
Save Mode
Select the save mode:
Append: Add new files to the specified path.
Overwrite: Replace existing files at the specified path.
Schema File Name
Upload a Spark schema file in JSON format. Recommended for CSV/JSON to enforce consistent datatypes.
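As an illustration of the Schema File Name field, the sketch below builds a schema in the JSON layout that Spark uses when serializing a `StructType` (a `struct` with a list of `fields`). Whether the S3 Writer expects exactly this layout is an assumption, and the column names, the helper functions, and the file name `sales_schema.json` are hypothetical.

```python
import json


def spark_field(name: str, dtype: str, nullable: bool = True) -> dict:
    """Describe one column in Spark's StructField JSON layout."""
    return {"name": name, "type": dtype, "nullable": nullable, "metadata": {}}


def spark_schema(columns) -> dict:
    """Wrap column descriptions in Spark's StructType JSON layout."""
    return {"type": "struct", "fields": [spark_field(*col) for col in columns]}


# Hypothetical columns for a sales dataset; replace with your own.
schema = spark_schema([
    ("order_id", "long", False),
    ("customer", "string"),
    ("amount", "double"),
    ("order_date", "date"),
])
schema_json = json.dumps(schema, indent=2)

if __name__ == "__main__":
    # Write the schema to a file that can then be uploaded in the task.
    with open("sales_schema.json", "w", encoding="utf-8") as fh:
        fh.write(schema_json)
```

Declaring types explicitly, such as `long` for identifiers and `date` for dates, is what prevents CSV or JSON columns from being inferred differently from one run to the next.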
Best Practices
Prefer Parquet, a columnar format, for analytics and cost-efficient storage. Avro is row-oriented and is better suited to record-by-record processing.
Partition outputs by time or business keys (e.g., year=2025/month=09/) to simplify downstream queries.
Rotate IAM credentials regularly, or use role-based access (IAM role, instance profile).
Test with a small sample before large production writes.
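The partitioning recommendation above can be sketched as a small helper that derives a `year=/month=` prefix from a date. The function name `partition_prefix` and the base prefix `sales` are hypothetical, and using the result as the value of the Table field is an assumption about how you choose to lay out the output.

```python
from datetime import date


def partition_prefix(base: str, day: date) -> str:
    """Return a directory-like prefix such as 'sales/year=2025/month=09/'."""
    # Drop empty segments so leading or trailing slashes in base are harmless.
    parts = [segment for segment in base.split("/") if segment]
    parts.append(f"year={day.year:04d}")
    parts.append(f"month={day.month:02d}")
    return "/".join(parts) + "/"


if __name__ == "__main__":
    print(partition_prefix("sales", date(2025, 9, 14)))
```

Zero-padding the month keeps prefixes in chronological order when they are listed lexicographically.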
Save & Next Steps
After configuring, click Save Task In Storage to persist your configuration.
Run a test job to validate credentials, schema, and file layout.
Monitor task logs to confirm successful writes to the target S3 location.
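In addition to reading the task logs, you can confirm a write by listing the objects under the target prefix. The sketch below is one possible check, not a feature of the S3 Writer: it accepts any client that exposes boto3's `get_paginator("list_objects_v2")` interface, and the function name `list_written_objects` and the bucket and prefix in the usage comment are hypothetical.

```python
def list_written_objects(client, bucket: str, prefix: str) -> list:
    """Return (key, size) pairs for every object under the given prefix."""
    paginator = client.get_paginator("list_objects_v2")
    found = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            found.append((obj["Key"], obj["Size"]))
    return found


# Usage with boto3 (requires credentials that allow s3:ListBucket):
#   import boto3
#   client = boto3.client("s3", region_name="us-east-1")
#   for key, size in list_written_objects(client, "example-bucket", "sales/2025/09/"):
#       print(key, size)
```

An empty result after a run that reported success usually means the bucket or prefix being checked does not match the configured values.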