# Export to Pipeline

{% hint style="success" %}
*Check out the walk-through on how to export a Notebook script to the Data Pipeline module.*
{% endhint %}

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Figrg2b2URgwMO5YmejDu%2Fuploads%2FDistQsn2yC31aq5k07p6%2FPySpark_Export%20New%20(1).mp4?alt=media&token=3604ae9b-feef-4710-ac10-3d2b61272e45>" %}
Exporting a PySpark Notebook to the Data Pipeline module
{% endembed %}

* Navigate to the ***Notebook list***.
* Click the ***Export to Pipeline*** icon for a Notebook.

<figure><img src="/files/iqWKAyJ2ySQcrSeiE7Up" alt=""><figcaption></figcaption></figure>

* The ***Export to Pipeline*** dialog box opens.
* Select a specific function using the checkbox.
* Click the ***Next*** option.

<figure><img src="/files/WzxtseUOQ1wZwy62BoQr" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
*<mark style="color:green;">Please Note</mark>: The Notebook script must be written inside a function to use the **Export to Pipeline** functionality.*
{% endhint %}
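
As a minimal sketch of what an exportable script might look like, the Notebook logic can be wrapped in a single function. The function name `run_export_demo`, its `min_amount` parameter, and the sample rows are hypothetical. This page does not state whether the platform supplies a Spark session, so the sketch assumes the script creates its own.

```python
def run_export_demo(min_amount=100):
    """Hypothetical start function; all script logic lives inside it."""
    # Imported inside the function so the whole script is contained in it.
    from pyspark.sql import SparkSession

    # Assumption: the script obtains its own session via getOrCreate().
    spark = SparkSession.builder.appName("export_demo").getOrCreate()

    # Illustrative in-memory rows standing in for real source data.
    rows = [("A", 50), ("B", 150), ("C", 300)]
    df = spark.createDataFrame(rows, ["customer", "amount"])

    # Input values may arrive as strings (assumption), so cast before comparing.
    result = df.filter(df.amount >= int(min_amount))
    result.show()
    return result.count()
```

A function written this way would be the one to select with the checkbox in the ***Export to Pipeline*** dialog box.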

* Click the ***Export*** option on the next page that opens for the Pipeline export.

<figure><img src="/files/B5z4e1tDaUQ0FC5zRPvv" alt=""><figcaption></figcaption></figure>

* A confirmation message appears once the export is complete.

<figure><img src="/files/zJnQ5mvyQUWdWytgveff" alt=""><figcaption><p>Notification message after Notebook gets exported.</p></figcaption></figure>

* Navigate to the ***Pipeline homepage***.
* Click the ***Create Job*** option.

<figure><img src="/files/9KYVUcaDjTqbK7qLBjyv" alt=""><figcaption></figcaption></figure>

* The ***New Job*** dialog window opens.
* Provide the required information to create a new job.
  * **Enter name:** Provide a name for the new job.
  * **Job Description:** Enter a description for the new job.
  * **Job Baseinfo:** Select the ***PySpark Job*** option from the drop-down.
  * **Trigger By:** The PySpark Job can be triggered by another Job or PySpark Job in two scenarios:
    * **On Success:** Select a job from the drop-down. Once the selected job runs successfully, it triggers the PySpark Job.
    * **On Failure:** Select a job from the drop-down. If the selected job fails, it triggers the PySpark Job.
  * **Is Scheduled:** Select the checkbox to schedule the new Job.
  * **Spark config:** Select the resources for the new Job.
* Click the ***Save*** option.

<figure><img src="/files/NmnrOG1mJNPZmEv4Hx33" alt=""><figcaption></figcaption></figure>
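
The **Trigger By** behavior described above can be pictured with a small conceptual sketch. It is not the platform's implementation: the `should_trigger` helper, the `new_job` dictionary, and all of its keys and values are hypothetical and only mirror the fields of the ***New Job*** dialog.

```python
def should_trigger(trigger_on, upstream_status):
    """Return True when the upstream job's outcome matches the chosen trigger."""
    # trigger_on is the scenario picked in the dialog: "success" or "failure".
    if trigger_on not in ("success", "failure"):
        raise ValueError("trigger_on must be 'success' or 'failure'")
    # upstream_status is the final state of the selected upstream job.
    return trigger_on == upstream_status


# Hypothetical job definition mirroring the dialog fields.
new_job = {
    "name": "export_demo_job",
    "description": "Runs the exported Notebook script",
    "base_info": "PySpark Job",
    "trigger_by": {"on": "failure", "job": "nightly_load"},
    "is_scheduled": False,
}

# The PySpark Job would start only if the upstream job failed.
starts_after_failure = should_trigger(new_job["trigger_by"]["on"], "failure")
starts_after_success = should_trigger(new_job["trigger_by"]["on"], "success")
```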

* A notification message appears, and the new Job is created.
* The newly created Job is placed in the ***Job Editor workspace*** by default.

<figure><img src="/files/6G5QFy39nukSgccchavH" alt=""><figcaption><p>Job Editor Workspace</p></figcaption></figure>

* Click on the Job component to open the configuration tabs.

<figure><img src="/files/iufbw9lbSk1spCHw2KdV" alt=""><figcaption></figcaption></figure>

* Open the ***Meta Information*** tab of the ***PySpark Job*** component.
* **Project Name:** Using the drop-down menu, select the Project in which the Notebook was created.
* **Script Name:** Select the script that was exported from the Notebook in the DS Lab module. The script written in the DS Lab module must be inside a function.
* **External Library:** List any external libraries used in the script. Separate multiple library names with commas (,).
* **Start Function:** Select the name of the function that contains the script.
* **Script:** The exported script appears in this space.
* **Input Data:** If the function takes parameters, provide each parameter name as the **Key** and its value as the **Value** in this field.
* Click ***Save component in the storage*** to use the PySpark component in a workflow inside the Data Pipeline module.

<figure><img src="/files/MAWjc9mSuJolLvMLJBrN" alt=""><figcaption><p>Consuming the Exported Data Science Script to the PySpark component.</p></figcaption></figure>
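
To illustrate how the **Input Data** and **External Library** fields relate to the script, here is a minimal sketch in plain Python. The function `start_function`, its parameters, and the sample values are hypothetical, and matching keys to parameters by name (like keyword arguments) is an assumption about how the component passes values, not documented behavior.

```python
def start_function(source_table, min_amount):
    """Hypothetical start function with two parameters."""
    # Values may arrive as strings (assumption), so cast the numeric one.
    return f"read {source_table} where amount >= {int(min_amount)}"


# Input Data as it might be entered: Key = parameter name, Value = argument.
input_data = {"source_table": "sales", "min_amount": "100"}

# External Library field: several library names separated by commas.
external_library = "pandas, numpy"
libraries = [name.strip() for name in external_library.split(",") if name.strip()]

# Assumption: keys are matched to the function's parameters by name.
message = start_function(**input_data)
print(libraries)
print(message)
```

Under these assumptions, each **Key** must match a parameter name of the selected **Start Function** exactly.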

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> Refer to the **Data Science Lab Quick Start Flow** page for an overview of the **Data Science Lab** module in a nutshell.* [***Click here***](/data-science-lab-2/data-science-lab-quick-start-flow.md) *to get redirected to the quick start flow page.*
{% endhint %}

