# Data Science Lab Quick Start Flow

The Data Science Lab module allows users to create Data Science Experiments and productionize them. This page summarizes the entire Data Science flow in a nutshell so that users can quickly begin their Data Science experiments.

## Project Creation

A Data Science Project created inside the Data Science Lab is like a Workspace inside which the user can create and store multiple data science experiments.

{% hint style="info" %}
*<mark style="color:green;">Pre-requisite:</mark> Configure the **DS Lab Settings** option before creating a Data Science Project. Also, use the **Algorithms** field in the* [***DS Lab Settings***](https://docs.bdb.ai/administrative-settings-4/admin-panel-options/configurations/data-science-lab-settings) *section to select the algorithms you wish to use inside your Data Science Lab project.*
{% endhint %}

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FBLGYLEkBUnc8nVEBAuEI%2Fuploads%2FxhNlzkUWxT2c2gw8IIkp%2FCreate%20a%20DSL%20Project%20V4.mp4?alt=media&token=6d7014da-b7ce-401a-976c-d8ed551bc0af>" %}
***Creating a DSL Project***
{% endembed %}

<details>

<summary>Creating a New Project</summary>

Open the Data Science Lab module and use the ***Create Project*** option to begin creating a Project. Refer to the [***Creating a Project***](https://docs.bdb.ai/data-science-lab-3/project/creating-a-project) page to understand the steps involved in Project creation in detail.

</details>

<details>

<summary>Project List</summary>

Once a Data Science Project is created, it is listed on the Projects page. The following Actions can be applied to each Project in the list:

1. [**View**](https://docs.bdb.ai/data-science-lab-3/project/project-list/view)
2. [**Push to VCS**](https://docs.bdb.ai/data-science-lab-3/project/project-list/keep-multiple-versions-of-a-project#pushing-a-project-to-the-vcs) (only available for an activated Project)
3. [**Pull from VCS**](https://docs.bdb.ai/data-science-lab-3/project/project-list/keep-multiple-versions-of-a-project#pulling-a-project-from-the-vcs) (only available for an activated Project)
4. [**Share**](https://docs.bdb.ai/data-science-lab-3/project/project-list/sharing-a-project)
5. [**Edit**](https://docs.bdb.ai/data-science-lab-3/project/project-list/editing-a-project)
6. [**Activate Project**](https://docs.bdb.ai/data-science-lab-3/project/project-list/activating-a-project)
7. [**Deactivate Project**](https://docs.bdb.ai/data-science-lab-3/project/project-list/deactivating-a-project)
8. [**Delete**](https://docs.bdb.ai/data-science-lab-3/project/project-list/deleting-a-project)

*<mark style="color:green;">**Please Note:**</mark> Refer to the* [***Project List***](https://docs.bdb.ai/data-science-lab-3/project/project-list) *page to understand all the options listed above in detail.*

</details>

<details>

<summary>Supported Environment</summary>

The following environments are supported inside a ***Data Science Lab*** Project.

* **TensorFlow**: Users can execute Sklearn commands by default in the notebook. If users select the TensorFlow environment, they do not need to explicitly install packages such as TensorFlow and Keras in the notebook. These packages can simply be imported inside the notebook.
* **PyTorch**: If users select the PyTorch environment, they do not need to explicitly install packages such as Torch and Torchvision in the notebook. These packages can simply be imported inside the notebook.
* **PySpark**: If users select the PySpark environment, they do not need to explicitly install packages such as PySpark in the notebook. These packages can simply be imported inside the notebook.

![](https://content.gitbook.com/content/1QeOywZjV1cHo55cMW8u/blobs/resQHFJK6fNABYHgYP14/image.png)
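
As a quick sanity check, a notebook cell along the following lines can confirm that the packages expected for the selected environment are importable. This is only a minimal sketch: the `ENVIRONMENT_PACKAGES` mapping and the `check_environment` helper are hypothetical names, and the import names (`sklearn`, `tensorflow`, `keras`, `torch`, `torchvision`, `pyspark`) are assumed from the package names mentioned above rather than taken from the platform itself.

```python
from importlib.util import find_spec

# Assumed import names for the packages each environment is described as
# providing; this mapping is illustrative, not read from the platform.
ENVIRONMENT_PACKAGES = {
    "TensorFlow": ["sklearn", "tensorflow", "keras"],
    "PyTorch": ["torch", "torchvision"],
    "PySpark": ["pyspark"],
}


def check_environment(environment):
    """Return {import_name: True/False} for the packages expected in an environment."""
    return {
        name: find_spec(name) is not None
        for name in ENVIRONMENT_PACKAGES[environment]
    }


if __name__ == "__main__":
    # Report which of the expected packages can be imported in this kernel.
    for package, available in check_environment("PyTorch").items():
        print(f"{package}: {'available' if available else 'missing'}")
```

A package reported as missing suggests that the notebook is running in a different environment than expected.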

*<mark style="color:green;">**Please Note:**</mark>*

* *The **Sklearn** environment is the **default environment** for a **Data Science Lab Project**.*
* *The Project-level tabs provided for the **TensorFlow** and **PyTorch** environments are the same, so the current document presents content for them together.*

</details>

## Dataset

Data is the first requirement for any Data Science Project. The user can add the required datasets and view the added datasets under a specific Project by using the ***Dataset*** tab.

Click the [***Dataset***](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset) tab from the ***Project List*** page to access the ***Add Datasets*** option.

<details>

<summary>Adding Data Sets</summary>

Under this tab, the user can get a list of the Data Sets and Data Sandbox files uploaded from the **Data Center** module.

The **Add Datasets** page offers the following Data service options to add as datasets:

1. [**Data Sets**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/adding-data-sets/data-sets) - These are the uploaded data sets (data services) from the Data Center module.
2. [**Data Sandbox**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/adding-data-sets/data-sandbox) - This option lists all the available (uploaded) Data Sandbox files.

![](https://content.gitbook.com/content/1QeOywZjV1cHo55cMW8u/blobs/dYQL4NJ6XqUTFQ9oXEXm/image.png)

</details>

{% hint style="success" %}
*Check out the illustrations below to understand the **Adding Dataset (Data Service)** and **Adding Data Sandbox** steps in detail.*
{% endhint %}

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FBLGYLEkBUnc8nVEBAuEI%2Fuploads%2F1usFuhxzoqJlC4AGUh99%2FAdding%20a%20Dataset.mp4?alt=media&token=26334851-998d-4b71-9dd7-7196e670dc7b>" %}
***Adding a Data Service as Dataset to DSL Project***
{% endembed %}

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FBLGYLEkBUnc8nVEBAuEI%2Fuploads%2F8nrj8rXKlpDwxPLj9tjf%2FUploading%20a%20Data%20Sandbox%20to%20DSL%20Project.mp4?alt=media&token=12034b66-0011-4ee5-8aa2-efb3b71882c4>" %}
***Uploading a Sandbox file and adding it to a DSL Project***
{% endembed %}

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*

1. *The user can add Datasets by using the **Dataset** tab or the **Notebook** page.*
2. *The Dataset types that can be added to a **Project** or **Notebook** depend on the selected Environment. E.g., the PySpark environment <mark style="color:orange;">does not support</mark> the **Data Service as Dataset**.*
3. *Refer to the **Adding Data Sets** section and its sub-pages to understand it in detail.*
4. *Refer to the* [***Data Preparation***](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/dataset-list-page/data-preparation) *page to understand how to apply the required Data Preparation steps to a specific dataset from the Data Set List page.*
   {% endhint %}

<details>

<summary>Data Set List</summary>

Each uploaded or added dataset offers Actions that help users create more accurate Data Science Experiments. The following major Actions are available for an added Data Set:

* [**Preview**](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/dataset-list-page/preview) - Opens a preview of the selected dataset.
* [**Data Profile**](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/dataset-list-page/data-profile) - Displays a detailed profile of the data, covering its quality, structure, and consistency.
* [**Create Experiment**](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/dataset-list-page/create-experiment) - Creates an Auto ML experiment on the selected Dataset.
* [**Data Preparation**](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/dataset-list-page/data-preparation) - Cleans the data to improve its quality and accuracy, which directly affects the reliability of the results.
* [**Delete**](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/dataset-list-page/delete) - Deletes the selected Dataset.

*<mark style="color:green;">**Please Note:**</mark>* *The user can **click each of the Action options listed above** to open the related information in detail.*
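
To give a rough idea of what a data profile summarizes, the sketch below computes per-column row counts, missing values, and distinct values for a small CSV sample using only the Python standard library. It is an illustration of the general idea and not the platform's **Data Profile** implementation; the `profile_rows` helper and the `SAMPLE_CSV` data are hypothetical.

```python
import csv
import io


def profile_rows(rows):
    """Summarize each column: row count, missing values, distinct values."""
    profile = {}
    for row in rows:
        for column, value in row.items():
            stats = profile.setdefault(
                column, {"count": 0, "missing": 0, "values": set()}
            )
            stats["count"] += 1
            # Treat absent fields and blank strings as missing values.
            if value is None or value.strip() == "":
                stats["missing"] += 1
            else:
                stats["values"].add(value)
    return {
        column: {
            "count": stats["count"],
            "missing": stats["missing"],
            "distinct": len(stats["values"]),
        }
        for column, stats in profile.items()
    }


# Hypothetical sample standing in for a dataset added to the Project.
SAMPLE_CSV = """age,city
34,Pune
,Delhi
29,Pune
"""

if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
    for column, stats in profile_rows(rows).items():
        print(column, stats)
```

Columns with many missing values are typical candidates for the Data Preparation step.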

</details>

## Data Science Experiment

Once the user creates a Project and adds the required Datasets to it, the Project is ready to hold a Data Science Experiment. The Data Science Lab user can proceed with Data Science Experiments in the following ways:

Use the ***Notebook*** infrastructure provided under the Project to create a script, save it as a model or script, load a model, and predict with it. It is also possible to save the Artifacts for a Saved Model. Refer to the [***Notebook***](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook) section for more details.
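
The generic train, save, load, and predict cycle can be sketched in plain Python as shown below. This is only an illustration under stated assumptions: it uses the standard `pickle` module and a tiny least-squares line fit, and the helper names (`fit_line`, `predict`, `save_model`, `load_model`) are hypothetical. Inside a Data Science Lab Notebook, the platform's own Notebook Operations for saving and loading models should be used instead.

```python
import pickle
import tempfile
from pathlib import Path


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, xs):
    """Apply the fitted line to new inputs."""
    return [model["slope"] * x + model["intercept"] for x in xs]


def save_model(model, path):
    """Serialize the model parameters to a file."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Read the model parameters back from a file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    model = fit_line([1, 2, 3], [2, 4, 6])
    with tempfile.TemporaryDirectory() as folder:
        model_path = Path(folder) / "model.pkl"
        save_model(model, model_path)
        restored = load_model(model_path)
    print(predict(restored, [4]))
```

Storing only the fitted parameters keeps the saved file independent of the notebook session that produced it.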

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FBLGYLEkBUnc8nVEBAuEI%2Fuploads%2Fxpbf56WLPpzPjrfmjQHz%2FCreating%20a%20Notebook%20V3.mp4?alt=media&token=ed19a8eb-738e-417a-9a00-069fdb0de2f5>" %}
***Creating a Notebook***
{% endembed %}

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FBLGYLEkBUnc8nVEBAuEI%2Fuploads%2FyUkza0LVzSakqYxbN1kx%2FImport%20notebook%20V2.mp4?alt=media&token=0cbd67df-4b24-4a03-a905-d9103334480c>" %}
***Importing a Notebook***
{% endembed %}

<details>

<summary>Data Science Model</summary>

The ***Notebook Operations*** section of the current documentation provides details on the following aspects of Data Science Models:

1. [**Save a Data Science Model**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/models#saving-a-dsl-model)
2. [**Load a Data Science Model**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/models#loading-a-dsl-model)
3. [**Save Artifacts for a saved Data Science Model**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/artifacts#saving-artifacts)

*<mark style="color:green;">**Please Note:**</mark>*

* ***Algorithms** and **Transforms** are also available for the **Data Science models** inside the **Notebook Page** as **Notebook Operations**.*
* *The **Notebook Page** may contain a customized Notebook Operations list based on the selected environment. E.g., the Data Science Projects created using the **PySpark environment** contain the following **Notebook Operations**:*
  * [**Datasets**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/datasets)
  * [**Secrets**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/secrets)
  * [**Variable Explorer**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/variable-explorer)
  * [**Writers**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/writers)
  * [**Find and Replace**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/find-and-replace)
* Refer to the environment-specific ***Notebook Operations*** by using the following links:
  * [***Notebook Operations for the TensorFlow & PyTorch***](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations)
  * [***Notebook Operations for the PySpark***](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-pyspark-environment/notebook/notebook-page/notebook-operations)

</details>

<details>

<summary>Notebook List Page</summary>

The Notebook List page lists all the created and saved Notebooks inside one Data Science Project. The user can apply the following Actions to a Notebook from the Notebook List page:

* [**Export to Pipeline**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-list/notebook-list-actions/export/export-to-p)
* [**Export to GIT**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-list/notebook-list-actions/export/export-to-git)
* [**Register as Job**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-list/notebook-list-actions/register-as-job)
* [**Notebook Version Control**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-list/notebook-list-actions/notebook-version-control)
* [**Sharing a Notebook**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-list/notebook-list-actions/sharing-a-notebook)
* [**Deleting a Notebook**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-list/notebook-list-actions/deleting-a-notebook)

</details>

Use the ***Auto ML*** functionality to get auto-trained Data Science models. The user can use the [***Create Experiment***](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/dataset-list-page/create-experiment) option provided under the ***Dataset List Page*** to begin creating an AutoML model. Refer to the [***AutoML***](https://docs.bdb.ai/data-science-lab-4/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml) section of this documentation for more details.

<details>

<summary>Data Science Experiment</summary>

* Access the ***Create Experiment*** option under the Dataset List for Projects created under a supported environment, such as ***PyTorch*** or ***TensorFlow***.
* Once the ***AutoML*** experiment is created successfully, the user is directed to the ***AutoML List***.
* The following ***Actions*** are provided on the ***AutoML List*** page:
  * [**View Report**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report)
    * **Details**
    * **Models** - This option provides the detailed model explanation.
  * [**Delete**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/delete)

</details>

The View Explanation option is provided for both manually created Data Science models and AutoML generated models.

<details>

<summary>Model Explainability</summary>

This section covers the ***Model Explainer*** dashboard for a ***Data Science Lab model***.

The [**View Explanation**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report/models/view-explanation) option follows this flow to explain a model:

* [**Model Summary**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report/models/view-explanation/model-summary) **-** The Model Summary/Run Summary displays basic information about the top trained model.
* [**Model Interpretation**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report/models/view-explanation/model-interpretation) **-** This option provides the Model Explainer dashboards for an AutoML model.
  * [**Classification Model Explainer**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report/models/view-explanation/model-interpretation/classification-model-explainer) **-** This page provides the explainer dashboards for Classification Models.
  * [**Regression Model Explainer**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report/models/view-explanation/model-interpretation/regression-model-explainer) **-** This page provides the explainer dashboards for Regression Models.
  * [**Forecasting Model Explainer**](https://docs.bdb.ai/data-science-lab-3/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report/models/view-explanation/model-interpretation/forecasting-model-explainer) **-** This page provides the explainer dashboards for Forecasting Models.
* [**Dataset Explainer**](https://docs.bdb.ai/data-science-lab/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/auto-ml/automl-list-page/view-report/models/view-explanation/dataset-explainer) **-** The Dataset Explainer tab provides a high-level preview of the dataset used for the experiment.

</details>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*

* *The **Auto ML** functionality <mark style="color:orange;">is not supported</mark> at present for a **Project** created within the **PySpark environment**.*
* ***Model As API** functionality <mark style="color:orange;">is not available</mark> for the **AutoML** Models.*
* ***Model Explainability** and **Model As API** functionalities <mark style="color:orange;">are not available</mark> for the **Imported models**.*
  {% endhint %}

<details>

<summary>Repo Sync Project</summary>

Use the following page links to reach the various functionalities provided under the Repo Sync Project:

* [Creating a Repo Sync Project](https://docs.bdb.ai/data-science-lab-4/repo-sync-project/creating-a-repo-sync-project)
* [Repo Sync Project List](https://docs.bdb.ai/data-science-lab-4/repo-sync-project/project-list)
* [Repo Sync Project in the Python Environment](https://docs.bdb.ai/data-science-lab-4/repo-sync-project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment) (TensorFlow or PyTorch Environment)
* [Repo Sync Project in the PySpark Environment](https://docs.bdb.ai/data-science-lab-4/repo-sync-project/tabs-for-a-data-science-lab-project/tabs-for-pyspark-environment)

</details>
