# Start your Data Science Experiment with Data Science Lab

## ​[Create your Data Science Project ](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/creating-a-project) <a href="#create-your-project" id="create-your-project"></a>

* Navigate to the Projects Page of the Data Science Lab plugin.
* Click the ***Create Project*** option to create a new project.
* A form opens asking for the following details for the new project:
  * ***Project Name***: Give a name to the new project.
  * ***Project Description:*** Describe the project.
  * ***Select Algorithms***: Select Algorithms using the drop-down menu.
  * ***Environment***: Allows users to select the environment they want to work in. Currently, the supported Python frameworks are ***PySpark***, ***TensorFlow***, and ***PyTorch***.

    * If the users select the ***TensorFlow*** environment, they do not need to install packages like ***TensorFlow*** and ***Keras*** explicitly in the notebook. These packages can simply be imported inside the notebook.
    * If the users select the ***PyTorch*** environment, they do not need to install packages like ***Torch*** and ***Torchvision*** explicitly in the notebook. These packages can simply be imported inside the notebook.

    The user can select an option out of the given choices: 1. Python TensorFlow, 2. Python PyTorch.
  * ***Resource Allocation***: This allows the users to allocate CPU/GPU and memory to be used by the Notebook container inside a given project. The currently supported Resource Allocation options are Low, Medium, and High.
  * ***Idle Shutdown***: It allows the users to specify the idle time limit after which the notebook session gets disconnected and the project gets deactivated. To use the notebook again, the project should be activated. The supported Idle Shutdown options are 30m, 1h, and 2h.
  * ***External Libraries***: Provide the links to the external libraries required for the project.
* Based on the selection in the Resource Allocation field, the following fields appear with pre-selected values:
  * Image Name
  * Image Version
  * Limit
  * Memory
  * Request (CPU)
  * Memory
* Select ***nvidia*** from the ***GPU Type*** field to improve the performance of the project.
* Click the ***Save*** option.
* The newly created project gets saved, and it appears on the screen.
* A notification message confirms that the project was created.
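
The form fields above can be summarized as a small settings record. The sketch below is purely illustrative: `ProjectSettings`, its field names, and the validation logic are hypothetical and not part of any DSL API. Only the allowed values (the three environments, Low/Medium/High, and 30m/1h/2h) come from the list above.

```python
from dataclasses import dataclass
from datetime import timedelta

# Allowed values taken from the form description; the constant names are illustrative.
ENVIRONMENTS = {"PySpark", "TensorFlow", "PyTorch"}
RESOURCE_LEVELS = {"Low", "Medium", "High"}
IDLE_SHUTDOWN = {
    "30m": timedelta(minutes=30),
    "1h": timedelta(hours=1),
    "2h": timedelta(hours=2),
}


@dataclass(frozen=True)
class ProjectSettings:
    """Hypothetical record mirroring the Create Project form."""

    name: str
    description: str
    environment: str
    resource_allocation: str
    idle_shutdown: str

    def __post_init__(self):
        # Reject values that the form would not offer.
        if not self.name.strip():
            raise ValueError("Project Name must not be empty")
        if self.environment not in ENVIRONMENTS:
            raise ValueError(f"Unsupported environment: {self.environment}")
        if self.resource_allocation not in RESOURCE_LEVELS:
            raise ValueError(f"Unsupported resource allocation: {self.resource_allocation}")
        if self.idle_shutdown not in IDLE_SHUTDOWN:
            raise ValueError(f"Unsupported idle shutdown: {self.idle_shutdown}")

    @property
    def idle_timeout(self) -> timedelta:
        """Idle time after which the notebook session gets disconnected."""
        return IDLE_SHUTDOWN[self.idle_shutdown]


settings = ProjectSettings(
    name="Sample Project",
    description="Demo project",
    environment="PyTorch",
    resource_allocation="Medium",
    idle_shutdown="1h",
)
print(settings.idle_timeout)  # prints 1:00:00
```

Writing the choices down this way makes it easy to review a planned configuration before filling in the form.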

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*

* *The user can also open the Project list by clicking the **View Project** option.*
  * *Click the **View Project** option.*
  * *The user gets redirected to the **Project list**.*
* *Once listed under the Project list, a project gets the **Share**, **Edit**, **Delete**, and **Activate**/**Deactivate** actions.*
* *A Data Science Project gets related tabs based on the selected environment.*
  * *The **PySpark** environment currently supports only the **Notebook**, **Dataset**, and **Utility** tabs.*
  * *The **PyTorch** and **TensorFlow** environments support the **Notebook**, **Dataset**, **Model**, **Utility**, and **AutoML** tabs.*
    {% endhint %}

{% hint style="info" %}
*<mark style="color:green;">Pre-requisite:</mark> Data Science Lab projects also offer the **Push to VCS** and **Pull from VCS** functionalities, but these are enabled only for activated DSL projects.*
{% endhint %}

## ​[Activate your Project](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/project-list/activating-a-project)  <a href="#activate-your-project" id="activate-your-project"></a>

* Navigate to the ***Projects*** page.
* Select a project from the list.
* Click the ***Activate*** option.
* A dialog window appears to confirm the Activation.
* Click the ***Yes*** option.
* The project gets activated, and a notification message confirms the completion of the action.
* The ***Activate*** option changes to the [***Deactivate***](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/project-list/deactivating-a-project) option for the concerned project.

## ​[Create your first Notebook](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/ways-to-access-notebook/creating-a-notebook)  <a href="#create-your-first-notebook" id="create-your-first-notebook"></a>

* Click the **Create Notebook** option from the Notebook tab.
* A new Notebook gets created, and a notification message confirms it.
* Click the **Back** icon.
* The Notebook gets saved under the Notebook list.

Please Note:

1. *Edit the Notebook name by using the **Edit Notebook Name** icon.*
2. *The accessible datasets, models, and artifacts are listed under the **Datasets**, **Models**, and **Artifacts** menus.*
3. *The **Find/Replace** menu lets the user find and replace specific text in the notebook code.*
4. *Add a description for the created Notebook on the same page.*
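
Because the TensorFlow and PyTorch environments come with their frameworks available for import, a first notebook cell can confirm that the expected packages are present before any real work starts. The following is a minimal sketch using only the standard library. The helper `available_packages` is hypothetical, and the module names `tensorflow`, `keras`, `torch`, and `torchvision` are assumed to be the import names of the packages mentioned for each environment.

```python
import importlib.util


def available_packages(names):
    """Return {module_name: True/False} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Import names assumed for the packages bundled with each environment.
expected = ["tensorflow", "keras", "torch", "torchvision"]
for name, found in available_packages(expected).items():
    print(f"{name}: {'available' if found else 'not installed'}")
```

In a PyTorch project the two `torch` entries would be expected to show as available, and in a TensorFlow project the `tensorflow` and `keras` entries.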

## ​[Upload your Notebook to DS Lab​](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/ways-to-access-notebook/uploading-a-notebook) <a href="#upload-your-notebook-to-ds-lab" id="upload-your-notebook-to-ds-lab"></a>

The users can seamlessly upload Notebooks created using other tools and saved in their systems.

* Navigate to the landing page of an activated Project.
* Click the ***Upload Notebook*** option.
* Specify a Notebook from the system.
* Click the ***Open*** option to upload the Notebook.
* The selected Notebook gets uploaded under the Project.
* The same gets confirmed by a notification message.
* Another notification message appears to inform the status of the Notebook (it gets saved by default).
* Click the ***Back*** icon.
* The uploaded Notebook gets listed on the landing page of the Project.

## ​[Create your first Data Science Lab Model ​](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/models) <a href="#create-your-first-dsl-model" id="create-your-first-dsl-model"></a>

Once the Notebook script is executed successfully, the users can save it as a model. The saved model can then be loaded into the Notebook.

### Save your Data Science Lab Model <a href="#save-your-dsl-model" id="save-your-dsl-model"></a>

* Navigate to a Notebook.
* Write code using the following sequence:
  * ***Read DataFrame***
  * ***Define test and train data***
  * ***Create a model***
* Execute the script.
* Add a new cell.
* Give a model name to specify the model.
* Execute the cell.
* After the code gets executed, click the ***Save Model*** option in a new cell.
* The saved model gets listed under the ***Models*** list.
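
The three coding steps above (read the data, define test and train data, create a model) can be sketched as follows. This is a dependency-free illustration using only the Python standard library. The inline CSV, the column names `hours` and `score`, and the helper functions are all hypothetical; a real DSL notebook would work on a DataFrame from a dataset added to the project and on a model from the chosen framework.

```python
import csv
import io
import random
from statistics import mean

# Hypothetical inline data standing in for a dataset added to the project.
CSV_TEXT = """hours,score
1,12
2,17
3,22
4,27
5,32
6,37
7,42
8,47
"""


def read_rows(text):
    """Step 1: read the tabular data (stand-in for reading a DataFrame)."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["hours"]), float(row["score"])) for row in reader]


def split_rows(rows, test_fraction=0.25, seed=0):
    """Step 2: shuffle, then split into train and test rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(train):
    """Step 3: create a model, here a least-squares line y = slope * x + intercept."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in train) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


rows = read_rows(CSV_TEXT)
train, test = split_rows(rows)
slope, intercept = fit_line(train)
print(f"train={len(train)} test={len(test)} slope={slope:.2f} intercept={intercept:.2f}")
```

Keeping the steps in separate functions mirrors the sequence of notebook cells: each one can be executed and checked before moving to the next.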

### Load your Data Science Lab Model <a href="#load-your-dsl-model" id="load-your-dsl-model"></a>

* Click on a new cell and select the model by using the given checkbox to load it.
* The model gets loaded into a new cell.
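
The ***Save Model*** and model-loading options are provided by the notebook UI, and the code they generate is not shown here. As a rough analogy only, the sketch below shows what persisting a fitted model and restoring it later means, by storing the model's parameters as JSON. `LineModel`, `save_model`, and `load_model` are hypothetical names and do not represent the DSL implementation.

```python
import json
import tempfile
from pathlib import Path


class LineModel:
    """Hypothetical fitted model: y = slope * x + intercept."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, xs):
        """Return one predicted value per input value."""
        return [self.slope * x + self.intercept for x in xs]


def save_model(model, path):
    """Persist the fitted parameters as JSON."""
    params = {"slope": model.slope, "intercept": model.intercept}
    Path(path).write_text(json.dumps(params))


def load_model(path):
    """Rebuild the model from the stored parameters."""
    params = json.loads(Path(path).read_text())
    return LineModel(params["slope"], params["intercept"])


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "line_model.json"
    save_model(LineModel(5.0, 7.0), model_path)
    restored = load_model(model_path)

print(restored.predict([1, 2, 3]))  # prints [12.0, 17.0, 22.0]
```

The restored object behaves like the original one, which is the property that lets a saved model be reused in a later notebook session.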

## ​[Predict the Model Output ](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/predict) <a href="#predict-the-model-output" id="predict-the-model-output"></a>

*Check out the walk-through on the **Predict** option for a DSL Notebook.*

The data scientist can get the predicted output, as an array, by passing a DataFrame to a loaded DSL model.

* Add a new cell.
* Click the ***Predict*** option.
* Execute the code.
* Provide the model and DataFrame.
* The predicted output of the given DataFrame appears as an array.
* The default comments on how to define the predicted output for a DS Lab model appear as well.

## ​[Save Artifacts of your model​ ](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-page/notebook-operations/artifacts) <a href="#save-artifacts-of-your-model" id="save-artifacts-of-your-model"></a>

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>*

* *The user can save the artifacts of a predicted model using this option.*
* *The saved **Artifacts** can be downloaded as well.*

{% endhint %}

* Add a new cell.
* Click the ***Save Artifacts*** option.
* Provide the DataFrame name and the artifact name (with one of the extensions .csv, .txt, or .json).
* Execute the cell.
* The ***Artifacts*** get saved.
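
The naming rule above (an artifact name ending in .csv, .txt, or .json) can be illustrated with a short sketch. This is not the code that the ***Save Artifacts*** option generates; `save_artifact` is a hypothetical helper that writes a list of row dictionaries to a local file, and the choice of tab-separated output for .txt is an assumption.

```python
import csv
import json
import tempfile
from pathlib import Path

ALLOWED_EXTENSIONS = {".csv", ".txt", ".json"}


def save_artifact(rows, artifact_name, directory):
    """Write a list of dict rows to directory/artifact_name and return the path."""
    path = Path(directory) / artifact_name
    extension = path.suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Artifact name must end in one of {sorted(ALLOWED_EXTENSIONS)}")
    if extension == ".json":
        path.write_text(json.dumps(rows, indent=2))
    else:
        # .csv gets comma-separated values; .txt gets tab-separated values (assumed).
        delimiter = "," if extension == ".csv" else "\t"
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]), delimiter=delimiter)
            writer.writeheader()
            writer.writerows(rows)
    return path


predictions = [{"id": 1, "predicted": 12.0}, {"id": 2, "predicted": 17.0}]
with tempfile.TemporaryDirectory() as tmp:
    saved = save_artifact(predictions, "predictions.csv", tmp)
    content = saved.read_text()
print(content)
```

Checking the extension before writing gives an early, readable error when the artifact name is mistyped.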

## ​[Deploy/ Register your DS Model to Pipeline ​](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/model/register-a-model) <a href="#deploy-your-ds-model-to-pipeline" id="deploy-your-ds-model-to-pipeline"></a>

The Data Scientist can deploy a saved DSL model to the Data Pipeline plugin by using the ***Model*** tab.

* Navigate to the ***Model*** tab.
* Select a model from the list.
* Click the ***Deploy to Pipeline*** icon for the model.
* The ***Deploy to Pipeline*** dialog box appears to confirm the action.
* Click the ***Yes*** option.
* The selected model gets published and deployed to the Data Pipeline (it disappears from the ***Unpublished*** model list).
* A notification message appears to inform the same.

1. *The published/deployed model gets listed under the **Published** filter.*
2. *The **Publish** option provided under the **Notebook tab** and the **Deploy to Pipeline** option provided under the **Model tab** perform the same task.*

## ​[Register your Data Science Model as an API ](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/model/register-a-model-as-an-api-service) <a href="#publish-your-ds-model-as-an-api" id="publish-your-ds-model-as-an-api"></a>

This function gets completed in three steps:

1. Publish a Model as an API
2. Register an API Client
3. Pass the Model values in the Postman

*Check out the video given below to understand the **Publish Model as an API Service** functionality.*

### **Publish a Model as an API**

You can publish a DSL model as an API using the Model tab. Only the published models get this option.

* Navigate to the ***Model*** tab.
* Filter the model list by using the ***Published*** filter option.
* Select a model from the list.
* Click the ***Publish as API*** option.
* The ***Update model*** page opens.
* Provide the Max instance limit.
* Click the ***Save and Publish*** option.

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark>* *Use the **Save** option to save the data which can be published later.*
{% endhint %}

* The model gets saved and published as an API service. A notification message appears to inform the same.

### **Register an API Client**

* Navigate to the **Admin** module.
* Click the ***API Client Registration*** option.
* The API Client Registration page opens.
* Click the ***New*** option.
* Select the Client type as ***internal***.
* Provide the following client-specific information:
  * Client Name
  * Client Email
  * App Name
  * Request Per Hour
  * Request Per Day
  * Select API Type: Select the ***Model as API*** option.
  * Select the Services Entitled: Select the published DSL model from the drop-down menu.
* Click the ***Save*** option.
* The client details get registered.
* A notification message appears to inform the same.

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> Once the client gets registered, open the registered client details using the **Edit** option to get the **Client id** and **Client Secret key**.*
{% endhint %}

### **Pass the Model values in the Postman**

* Open Postman.
* Add a new ***POST*** request.
* Pass the URL with model name (Only the Sklearn models are supported at present).
* Provide the required parameters under the ***Params*** tab:
  * Client Id
  * Client Secret Key
  * App Name
* Open the ***Body*** tab.
* Select the ***raw*** option.
* Provide the input DataFrame.
* Click the ***Send*** option.
* The response will appear below.
* You can save the response by using the ***Save Response*** option.
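
The same request can also be assembled in code instead of Postman. The sketch below only builds the request and does not send it. Every concrete value in it is an assumption: the URL is a placeholder, the query parameter names `clientid`, `clientsecret`, and `appname` are hypothetical stand-ins for the Client Id, Client Secret Key, and App Name, and the JSON list of records is one possible raw body format. Take the real values from your deployment and the registered client details.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder values only; none of these come from the product itself.
BASE_URL = "https://example.invalid/model-api/my_sklearn_model"  # hypothetical URL
PARAMS = {
    "clientid": "<client-id>",  # hypothetical parameter names
    "clientsecret": "<client-secret-key>",
    "appname": "<app-name>",
}


def build_request(base_url, params, records):
    """Build (but do not send) a POST request with a raw JSON body."""
    body = json.dumps(records).encode("utf-8")
    return Request(
        url=f"{base_url}?{urlencode(params)}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(BASE_URL, PARAMS, [{"hours": 3}, {"hours": 4}])
print(request.get_method(), request.full_url)
# To send it, pass the object to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it easy to inspect the URL, parameters, and body before calling the service.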

{% hint style="info" %}
*<mark style="color:green;">Please Note</mark><mark style="color:green;">:</mark>* *The model published as an API service can be easily consumed under various apps.*
{% endhint %}

## ​[Adding Datasets to a Project](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/adding-data-sets) <a href="#add-data-sets-to-your-project" id="add-data-sets-to-your-project"></a>

The ***Dataset*** tab lists the uploaded Data Sets and Data Sandboxes from the Data Center module that can be added to a project. The ***Add Datasets*** page offers the following options to add as datasets:

1. [**Data Service**](https://docs.bdb.ai/7.6/data-science-lab/project/various-tabs-to-work-with/dataset/adding-data-sets/data-sets) – These are the uploaded data sets from the Data Center module.
2. [**Data Sandbox**](https://docs.bdb.ai/7.6/data-science-lab/start-your-data-science-experiment-with-ds-lab) – This option lists all the available Data Sandboxes from the Data Center module.

## **​**[**Adding Data Service​**](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-pyspark-environment/notebook/notebook-page/notebook-operations/datasets)

* Navigate to a Project-specific page and click the ***Dataset*** tab (e.g., the ***Dataset*** tab under the ***Sample Project***).
* Click the ***Add Datasets*** button.
* The ***Add Datasets*** page opens offering two options to choose data:
  * Data service (gets selected by default)
  * Data Sandbox
* Use the ***Search*** space to search through the displayed data service list.
* Select the required data service(s) using the checkboxes provided next to them.
* Click the ***Add*** option.
* The selected data service(s) get added to the concerned project.
* A notification message appears to inform the same.

## **​**[**Adding Data Sandbox​**](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/dataset/adding-data-sets/data-sandbox)

* Open the ***Dataset*** tab from a specific project.
* Click the ***Add Datasets*** option.
* You get redirected to the ***Add Datasets*** page.
* Select the Data Sandbox option from the Data Service drop-down menu.
* Use the ***Search*** space to search for a specific Data Sandbox.
* Select the required Data Sandbox(es) using the checkboxes provided next to them.
* Click the ***Add*** option.
* The selected Data Sandbox(es) get added to the concerned project.
* A notification message appears to inform the same.

## ​[VCS for Projects ](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/project-list/keep-multiple-versions-of-a-project) <a href="#vcs-for-projects" id="vcs-for-projects"></a>

{% hint style="info" %}
*Pre-requisite: Make sure that the Version control settings for the DSL plugin are configured by your administrator before you use this functionality.*
{% endhint %}

### **Pushing a Project to the VCS**

* Navigate to the ***Projects*** page of the DS Lab plugin.
* Select an ***activated project***.
* Click the ***Push into VCS*** icon for the project.
* The ***Push into Version Controlling System*** dialog box appears.
* Provide a ***Commit*** Message.
* Click the ***Push*** option.
* The DSL Project version gets pushed into the Version Controlling System, and a notification message confirms it.

### **Pulling a Project from the VCS**

* Navigate to the ***Projects*** page of the DS Lab plugin.
* Select an ***activated project***.
* Click the ***Pull from VCS*** icon for the project.
* The ***Pull from Version Controlling System*** dialog box opens.
* Select the version that you wish to pull by using the checkbox.
* Click the ***Pull*** option.
* The pulled version of the selected Project gets updated in the Project list.
* A notification message appears to inform the same.

{% hint style="info" %}
*<mark style="color:green;">Please Note:</mark> The Push to and Pull from VCS functionalities will not be enabled for a deactivated project.*
{% endhint %}

## ​[VCS for Notebooks​](https://docs.bdb.ai/data-science-lab/~/changes/lSgdXe3Do34tLrgwmc0c/project/tabs-for-a-data-science-lab-project/tabs-for-tensorflow-and-pytorch-environment/notebook/notebook-list-page/notebook-version-control) <a href="#vcs-for-notebooks" id="vcs-for-notebooks"></a>

{% hint style="info" %}
*<mark style="color:green;">Pre-requisite:</mark> Make sure that the Version control settings for the DSL plugin are configured by your administrator before you use this functionality.*
{% endhint %}

### **Pushing a Notebook to the VCS**

* Navigate to the Notebook list of a Project.
* Select a Notebook.
* Click the ***Push into VCS*** icon for the Notebook.
* The ***Push into Version Controlling System*** dialog box appears.
* Provide a ***Commit*** Message.
* Click the ***Push*** option.
* The Notebook version gets pushed into the Version Controlling System and the Notebook list gets updated with the latest version.
* A notification message appears to inform the success of the action.

### **Pulling a Notebook from the VCS**

* Navigate to the Notebook list given under a Project.
* Select a Notebook.
* Click the ***Pull from VCS*** icon for the Notebook.
* The ***Pull from Version Controlling System*** dialog box opens.
* Select the version that you wish to pull by using the checkbox.
* Click the ***Pull*** option.
* The pulled version of the selected Notebook gets updated in the Notebook list.
* A notification message appears to inform the success of the action.
