# HDFS Writer

HDFS stands for **Hadoop Distributed File System**. It is a distributed file system designed to store and manage large data sets in a reliable, fault-tolerant, and scalable way. HDFS is a core component of the Apache Hadoop ecosystem and is used by many big data applications.

This component writes data to HDFS.

All component configurations are broadly classified into three sections:

* [Basic Information](https://docs.bdb.ai/data-pipeline/components/component-base-configuration)
* Meta Information
* [Resource Configuration](https://docs.bdb.ai/7.6/data-pipeline/components/resource-configuration)

{% hint style="success" %}
*Follow the steps in the demonstration below to configure the HDFS Writer component.*
{% endhint %}

<figure><img src="https://859511478-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGDmsjfjJBNqow7Fo97cO%2Fuploads%2FD67m1as0D8gKMg2foQSO%2FHDFS_writer.gif?alt=media&#x26;token=60b90fb7-3a91-4158-af21-c4a5b1bd930c" alt=""><figcaption><p>Configuring the HDFS Writer component</p></figcaption></figure>

## **Configuring the Meta Information tab of the HDFS Writer**

* **Host IP Address:** Enter the host IP address of the HDFS cluster.
* **Port:** Enter the port number on which HDFS is listening.
* **Table:** Enter the name of the table to which the data has to be written.
* **Zone:** Enter the HDFS zone in which the data has to be written. A zone is a special directory whose contents are transparently encrypted upon write and transparently decrypted upon read.
* **File Format:** Select the file format in which the data has to be written:
  * ***CSV***
  * ***JSON***
  * ***PARQUET***
  * ***AVRO***
* **Save Mode:** Select a save mode, which determines how the data is handled if the target already exists (for example, Append or Overwrite).
* **Schema File Name:** Upload the Spark schema file in JSON format (see the schema sketch below).
* **Partition Columns**: Provide a unique key column name to partition the data in Spark, as illustrated in the sketch after this list.
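
The Meta Information fields map closely onto a standard Spark DataFrame write. The sketch below is illustrative only, assuming a Spark runtime: the host IP (`192.0.2.10`), port (`8020`), zone/table path, and column names are placeholders for the values you enter in the tab, not values used by the component itself.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hdfs-writer-sketch").getOrCreate()

# Sample data standing in for the pipeline's incoming records
df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])

(
    df.write
    .mode("append")       # Save Mode
    .partitionBy("id")    # Partition Columns: a key column to partition by
    .format("csv")        # File Format: CSV / JSON / PARQUET / AVRO
    # Target path assembled from Host IP Address, Port, Zone, and Table
    .save("hdfs://192.0.2.10:8020/my_zone/my_table")
)

spark.stop()
```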
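
For the **Schema File Name** field, the component expects a Spark schema serialized as JSON. One way to produce such a file is sketched below; the field names and types are hypothetical and should be adjusted to match your data.

```python
from pyspark.sql.types import StructField, StructType, IntegerType, StringType

# Hypothetical two-column schema
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
])

# StructType.json() emits the JSON representation Spark understands;
# save this output to a .json file and upload it in the Meta Information tab.
print(schema.json())
```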
