Workflow 2

Objective: create forecasting models using the built-in algorithms, utilities, and artifact features of DS Lab.

In Workflow 2, we demonstrate the power and simplicity of creating forecasting models using the in-built algorithms and boilerplate code features of DS Lab. We also highlight the utility and artifact features that help streamline and optimize model development.

This workflow covers the following key steps:

· DS Lab Project Creation

· Upload a Utility Script File

· Create a Forecasting Model using the N-BEATS algorithm and save the preprocessed DataFrame as an Artifact for further experimentation

· Use Artifact Data to create another forecasting model using the Random Forest Algorithm

Create a Project in DS Lab

  1. Navigate to DS Lab from the Apps menu.

  2. Click the Create+ button.

  3. Enter a project name (DS LAB WORKFLOW 2) and a description.

  4. Select the algorithm type Forecasting from the drop-down menu (the options include Regression, Classification, Forecasting, NLP, etc.).

  5. Choose an environment:

     · Python TensorFlow (used in this workflow)

     · PyTorch

     · PySpark

  6. Allocate resources according to dataset size: Low.

  7. Set an Idle Shutdown limit: 30 min.

  8. Add external libraries if required.

  9. Save the project.

To begin:

  1. Activate the project by clicking on the Activate option located on the right.

  2. Open the project by clicking “view”, which appears on the right once the project is activated.

  3. Wait for the kernel to start.

  4. Navigate to the utils tab in the project interface.

  5. Click the three dots of the utils tab and select the import option.

  6. Give the utility a name (utility_func) and choose the utility file from your local system.

  7. Once done, click “save”.
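A utility script is an ordinary Python file that holds reusable helpers. As a rough sketch of what a file such as utility_func might contain, the example below defines one preprocessing helper. The function name preprocess_sales, the sample data, and the column names date and sale (chosen to match the parameters used later in this workflow) are assumptions for illustration, not the contents of any file shipped with DS Lab.

```python
import pandas as pd


def preprocess_sales(df, time_col="date", target_col="sale"):
    """Hypothetical helper: return a cleaned copy of df sorted by time."""
    out = df.copy()
    # Parse the time column; values that cannot be parsed become NaT and are dropped.
    out[time_col] = pd.to_datetime(out[time_col], errors="coerce")
    out = out.dropna(subset=[time_col])
    # Coerce the target to numeric, sort chronologically, then forward-fill gaps.
    out[target_col] = pd.to_numeric(out[target_col], errors="coerce")
    out = out.sort_values(time_col).reset_index(drop=True)
    out[target_col] = out[target_col].ffill()
    return out


# Small made-up sample: one unparseable date and one missing sale value.
raw = pd.DataFrame(
    {
        "date": ["2024-01-03", "2024-01-01", "bad", "2024-01-02"],
        "sale": [30, 10, 5, None],
    }
)
clean = preprocess_sales(raw)
print(clean)
```

Keeping helpers like this in a utility file means every notebook in the project can reuse the same preprocessing instead of copying it cell by cell.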

Create a Forecasting Model

Create a Notebook and Upload a Dataset

· Activate the project by clicking on the Activate option located on the right.

· Open the project by clicking “view”, which appears on the right once the project is activated.

· Wait for the kernel to start.

· Click “create” to create a notebook, give it an appropriate name (DS Lab workflow 2) and a description, and then click “save”.

· Click the “Data” icon on the left side of the workspace to add the sandbox as data.

· Then click the ‘+’ icon to the right of the search bar.

· An “add data” page opens. Select the “data sandbox files” option from the data source drop-down menu.

· Click on the “upload” option located to the right of the “add data” page.

· Give the sandbox a name and an appropriate description, and choose the sandbox file from your local system, i.e. “Forecasting data”.

· Click “save” to upload the file. Once the file is uploaded, a pop-up message appears: “File is uploaded”.

· Select the check box of the newly uploaded sandbox and click “add” to add the sandbox as data.

· Once the sandbox is added, click on the cell and then select the check box of the sandbox listed under the data section.

· Code appears in the cell automatically. This code loads the data from the sandbox file you uploaded.

· Run the cell using the “run cell” icon in the top-left corner of the cell.

· After running the first cell of the notebook, write code to import the utility file that you uploaded earlier in the utils section.

· Once your code is written, run the cell. This makes the utility file available in the notebook.

· Once the code runs successfully, we will start building our model.

· Navigate to the algorithm section on the left side (in the same place as the data section).

· Click on the algorithm section and select the forecasting option.

· Click on the cell where you want to generate the code, and then choose the N-BEATS option from the drop-down menu.

· The code is generated automatically, so the model can be built with minimal manual coding.

· Make the necessary changes in the code, such as the following, and then run the cell:

_data = 'df'

_time = 'date'

_target = 'sale'

· The results will be plotted using the Matplotlib library, showing the actual and predicted chart.

· Use "modelname.predict" to predict results for different scenarios.

· Now add a cell below the import utility cell.

· Click the three dots of the cell and select the “save artifact” option.

· Code is generated automatically.

· Provide a name for the dataframe and a file name with the .csv extension to store the artifact successfully.

· After configuring the code, run the cell.

· Now navigate to the artifacts section (located in the same place as the data section).

· You will see that the artifact has been created successfully.
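The N-BEATS and “save artifact” cells that DS Lab generates are specific to the platform and are not reproduced here. As a platform-independent sketch of what the steps above rely on, the snippet below first checks that a DataFrame has the columns named in _time and _target, then writes it to a .csv file and reads it back. The helper check_forecast_columns, the sample data, and the file name preprocessed_sales.csv are assumptions for illustration; in DS Lab, use the code produced by the “save artifact” option.

```python
import os
import tempfile

import pandas as pd


def check_forecast_columns(df, time_col, target_col):
    """Hypothetical helper: fail early if df cannot be used for forecasting."""
    missing = [c for c in (time_col, target_col) if c not in df.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    if df[time_col].duplicated().any():
        raise ValueError("time column contains duplicate timestamps")
    return True


# Made-up stand-in for the preprocessed DataFrame.
df = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=5, freq="D"),
        "sale": [10.0, 12.0, 11.0, 15.0, 14.0],
    }
)
check_forecast_columns(df, time_col="date", target_col="sale")

# Write the frame to a .csv file (a temporary directory is used for this demo).
artifact_dir = tempfile.mkdtemp()
artifact_path = os.path.join(artifact_dir, "preprocessed_sales.csv")
df.to_csv(artifact_path, index=False)

# Reading the file back confirms the stored data matches the original.
restored = pd.read_csv(artifact_path, parse_dates=["date"])
print(restored)
```

Writing with index=False keeps the row index out of the file, so a notebook that later loads the .csv sees only the date and sale columns.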

Create Forecasting Model using Artifact data as input

Create a Notebook and Upload a Dataset

· Click “create” to create a notebook, give it an appropriate name (Test artifacts and random forest algorithm) and a description, and then click “save”.

· Click the “Data” icon on the left side of the workspace to add the sandbox as data.

· Then click the ‘+’ icon to the right of the search bar.

· An “add data” page opens. Select the “data sandbox files” option from the data source drop-down menu.

· Click on the “upload” option located to the right of the “add data” page.

· Give the sandbox a name and an appropriate description, and choose the sandbox file from your local system.

· Click “save” to upload the file. Once the file is uploaded, a pop-up message appears: “File is uploaded”.

· Select the check box of the newly uploaded sandbox and click “add” to add the sandbox as data.

· Once the sandbox is added, click on the cell and then select the check box of the sandbox listed under the data section.

· Code appears in the cell automatically. This code loads the data from the sandbox file you uploaded.

· Run the cell using the “run cell” icon in the top-left corner of the cell.

Once the code runs successfully, we will start building our model.

· Navigate to the algorithm section on the left side (same as where the data section is located).

· Click on the algorithm section and select the forecasting option.

· Click on the cell where you want to generate the code, and then choose the Random Forest option from the drop-down menu.

· The code is generated automatically, so the model can be built with minimal manual coding.

· Make the necessary changes in the code, such as the following, and then run the cell:

_data = 'df_preprocess_dataframe'

_time = 'date'

_target = 'sale'

Once the code runs successfully, you will have a Random Forest model that can be used for forecasting.
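The code DS Lab generates for Random Forest is not shown in this guide. For readers curious about the general idea, one common way to forecast with a Random Forest is to turn the series into a supervised problem using lagged values of the target. The sketch below does this with scikit-learn on synthetic data. The helper make_lag_features, the seven-lag window, and the generated sales values are assumptions for illustration and may differ from what the platform produces.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def make_lag_features(df, target_col="sale", n_lags=3):
    """Hypothetical helper: add lag_1..lag_n columns built from past target values."""
    out = df.copy()
    for lag in range(1, n_lags + 1):
        out[f"lag_{lag}"] = out[target_col].shift(lag)
    # The first n_lags rows have no complete history, so they are dropped.
    return out.dropna().reset_index(drop=True)


# Synthetic daily sales with a weekly pattern, standing in for the artifact data.
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=60, freq="D")
sales = 100 + 10 * np.sin(np.arange(60) * 2 * np.pi / 7) + rng.normal(0, 1, 60)
df_preprocess_dataframe = pd.DataFrame({"date": dates, "sale": sales})

supervised = make_lag_features(df_preprocess_dataframe, n_lags=7)
feature_cols = [c for c in supervised.columns if c.startswith("lag_")]

# Chronological split: the last 7 rows are held out, with no shuffling.
split = len(supervised) - 7
train, test = supervised.iloc[:split], supervised.iloc[split:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(train[feature_cols], train["sale"])

predictions = model.predict(test[feature_cols])
mae = float(np.mean(np.abs(predictions - test["sale"].to_numpy())))
print(f"Mean absolute error on the hold-out week: {mae:.2f}")
```

The split is chronological rather than random so that the model is always evaluated on dates later than those it was trained on, which mirrors how a forecast is used in practice.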

Once done, deactivate the project to avoid unnecessary resource wastage.

This completes Workflow 2 of the Data Science Lab module.
