Create Forecasting Models in DS Lab

Workflow 2 in DS Lab shows how to create and optimize forecasting models using in-built algorithms, boilerplate code, and utility/artifact features.

Purpose

This guide explains how to create and compare forecasting models in the BDB DS Lab using the platform’s in-built algorithms, boilerplate code, and artifact utilities. The workflow covers two forecasting approaches, N-BEATS and Random Forest, and uses a utility script and a saved artifact to share preprocessing code and data between notebooks.

Objectives

By completing this workflow, you will learn how to:

  1. Create and configure a DS Lab project.

  2. Upload and use a utility script.

  3. Build a forecasting model using the N-BEATS algorithm.

  4. Save a pre-processed DataFrame as an artifact for reuse.

  5. Build another forecasting model using the Random Forest algorithm based on the artifact data.

Step 1 – Create a New DS Lab Project

  1. From the Apps menu, open DS Lab.

  2. Click Create +.

  3. Enter the following details:

    Field                 Example                         Notes
    Project Name          DS LAB WORKFLOW 2               Descriptive and unique
    Description           “Forecasting Model Workflow”    Optional
    Algorithm Type        Forecasting                     Other options: Regression, Classification, NLP
    Environment           Python TensorFlow               Selected for this workflow
    Resource Allocation   Low                             Adjust based on dataset size
    Idle Shutdown         30 minutes                      Auto-terminates inactive sessions

  4. Add external libraries if required (for example, boto3).

  5. Click Save.

  6. Activate the project using the Activate button on the right panel.

  7. Once activated, click View to open it and wait for the kernel to start.

Step 2 – Upload a Utility Script

  1. In the project interface, navigate to the Utils tab.

  2. Click the three-dot menu (⋮) and select Import.

  3. Enter a name for the script, for example utility_func.

  4. Browse your local system and select the required utility Python file.

  5. Click Save.

Step 3 – Create a Notebook and Upload a Dataset

  1. Ensure the project kernel is active.

  2. Click Create Notebook and provide:

    • Notebook Name: DS Lab Workflow 2

    • Description: “Forecasting Model Development”

  3. Click Save.

  4. Click the Data icon on the left navigation bar.

  5. Click the + Add Data icon on the upper-right.

  6. On the Add Data page:

    • Select Data Sandbox Files as the data source.

    • Click Upload, provide a name (e.g., Forecasting Data), description, and choose the CSV file from your local machine.

  7. After upload, a pop-up confirms “File is uploaded.”

  8. Select the checkbox beside the uploaded file and click Add.

  9. A code snippet automatically appears in the notebook cell to load the dataset.

  10. Click Run Cell to execute.
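
Before moving on, it can help to confirm that the loaded DataFrame has the columns the later steps rely on. The sketch below is only an illustration: it builds a small synthetic DataFrame in place of the one produced by the auto-generated cell, and it assumes the columns are named date and sale, as in Step 5. The helper name check_forecasting_frame is hypothetical and not part of DS Lab.

```python
import io

import pandas as pd

# Synthetic stand-in for the DataFrame that the auto-generated cell loads.
# In the notebook, `df` would already exist; the column names are assumed.
sample_csv = io.StringIO(
    "date,sale\n"
    "2023-01-01,120\n"
    "2023-01-02,135\n"
    "2023-01-03,\n"
    "2023-01-04,150\n"
)
df = pd.read_csv(sample_csv)


def check_forecasting_frame(frame, time_col="date", target_col="sale"):
    """Return a small summary of issues that commonly break forecasting runs."""
    missing_cols = [c for c in (time_col, target_col) if c not in frame.columns]
    if missing_cols:
        raise KeyError(f"Missing expected columns: {missing_cols}")
    parsed = pd.to_datetime(frame[time_col], errors="coerce")
    return {
        "rows": len(frame),
        "unparseable_dates": int(parsed.isna().sum()),
        "missing_targets": int(frame[target_col].isna().sum()),
        "sorted_by_time": bool(parsed.is_monotonic_increasing),
    }


summary = check_forecasting_frame(df)
print(summary)
```

A non-zero count for unparseable dates or missing targets suggests the data needs cleaning before a model is trained on it.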

Step 4 – Import the Utility Script

  1. In a new cell, write Python code to import the uploaded utility file.

  2. Run the cell to verify successful import and to load any custom functions required for preprocessing.
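
The exact import statement depends on how DS Lab exposes files from the Utils tab, so it is not reproduced here. As a rough sketch, assuming the utility file defines a preprocessing helper (the function name preprocess_sales and its logic are hypothetical, not taken from the platform), the file's contents and its use in the notebook might look like this:

```python
import pandas as pd


# --- Hypothetical contents of the utility file (e.g. utility_func.py) ---
def preprocess_sales(frame, time_col="date", target_col="sale"):
    """Parse dates, drop unusable rows, and sort chronologically."""
    out = frame.copy()
    out[time_col] = pd.to_datetime(out[time_col], errors="coerce")
    out = out.dropna(subset=[time_col, target_col])
    out = out.drop_duplicates(subset=[time_col], keep="last")
    return out.sort_values(time_col).reset_index(drop=True)


# --- Hypothetical use in the notebook after the import succeeds ---
# Synthetic data stands in for the DataFrame loaded in Step 3.
df = pd.DataFrame(
    {
        "date": ["2023-01-03", "2023-01-01", "not a date", "2023-01-02", "2023-01-02"],
        "sale": [150.0, 120.0, 99.0, None, 135.0],
    }
)
df_preprocessed = preprocess_sales(df)
print(df_preprocessed)
```

Keeping this logic in a utility file rather than in a notebook cell means the same preprocessing can be reused from any notebook in the project.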

Step 5 – Create a Forecasting Model Using N-BEATS

  1. Select the notebook cell that should receive the generated code.

  2. From the left-hand menu, select the Algorithm tab.

  3. Choose Forecasting → N-BEATS.

  4. The system auto-generates boilerplate code for N-BEATS forecasting in the selected cell.

  5. Update the key parameters:

    _data   = 'df'
    _time   = 'date'
    _target = 'sale'

  6. Run the cell to train the model.

  7. Visualize the results:

    • Use matplotlib to display actual vs. predicted sales.

    • Call predict() on the trained model object (shown here as modelname) to generate forecasts for additional scenarios.
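
The actual vs. predicted comparison can be drawn with a few lines of matplotlib. This is a minimal sketch using synthetic numbers; in the notebook, the actual values would come from the dataset and the predicted values from the trained model, whose variable names depend on the generated boilerplate.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic values for illustration only; in the notebook these would come
# from the held-out data and from the trained model's predictions.
dates = pd.date_range("2023-01-01", periods=6, freq="D")
actual = [120, 135, 150, 142, 160, 171]
predicted = [118, 130, 155, 140, 158, 175]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(dates, actual, marker="o", label="Actual")
ax.plot(dates, predicted, marker="x", linestyle="--", label="Predicted")
ax.set_xlabel("date")
ax.set_ylabel("sale")
ax.set_title("Actual vs. predicted sales")
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
# In a notebook with inline plotting, the figure renders when the cell finishes.
```

Plotting both series on the same axes makes it easy to see where the forecast drifts away from the observed values.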

Step 6 – Save the Preprocessed Data as an Artifact

  1. Insert a new cell below the import-utility cell.

  2. Open the cell menu (⋮) and select Save Artifact.

  3. The system generates boilerplate code automatically.

  4. Configure the artifact:

    artifact_df = df_preprocessed
    artifact_name = "forecasting_artifact.csv"

  5. Run the cell to save the artifact.

  6. Open the Artifacts tab on the left to confirm creation.
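
Assuming the artifact is written as an ordinary CSV file, as its name suggests, column types are not stored with it: a datetime column comes back as plain text unless it is parsed again on load. The sketch below illustrates this with an in-memory CSV round trip in pandas. It does not use the DS Lab Save Artifact boilerplate, and the data is synthetic.

```python
import io

import pandas as pd

# Synthetic stand-in for df_preprocessed.
df_preprocessed = pd.DataFrame(
    {
        "date": pd.date_range("2023-01-01", periods=3, freq="D"),
        "sale": [120.0, 135.0, 150.0],
    }
)

# Write to an in-memory buffer instead of the platform's artifact store.
buffer = io.StringIO()
df_preprocessed.to_csv(buffer, index=False)

# Reload without and with date parsing to compare the resulting dtypes.
buffer.seek(0)
reloaded_raw = pd.read_csv(buffer)
buffer.seek(0)
reloaded_parsed = pd.read_csv(buffer, parse_dates=["date"])

print(reloaded_raw["date"].dtype)     # text, not datetime
print(reloaded_parsed["date"].dtype)  # datetime
```

If the notebook that reloads the artifact needs real dates, it is worth checking the column type after loading and converting it with pd.to_datetime where necessary.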

Step 7 – Create a Forecasting Model Using Artifact Data

7.1 Create a New Notebook

  1. Click Create Notebook and name it Test Artifacts and Random Forest Algorithm.

  2. Provide a brief description and click Save.

7.2 Load Artifact Data

  1. Click the Data icon → + Add Data.

  2. Select Data Sandbox Files as the source.

  3. Upload or reference the same artifact file saved earlier.

  4. Check its box and click Add.

  5. Run the automatically generated code cell to load the artifact dataset.

7.3 Build a Random Forest Forecasting Model

  1. Select the notebook cell that should receive the generated code.

  2. From the Algorithm tab, select Forecasting → Random Forest.

  3. Modify the generated script as follows, setting _data to the name of the DataFrame created by the load cell in 7.2:

    _data   = 'df_preprocess_dataframe'
    _time   = 'date'
    _target = 'sale'

  4. Run the cell to train and evaluate the Random Forest forecasting model.
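
The generated boilerplate is not reproduced here. For orientation, one common way to apply a Random Forest to a time series is to turn it into a supervised problem with lagged values as features. The following sketch uses scikit-learn's RandomForestRegressor on synthetic data; the lag count, the split size, and the helper name make_lag_features are illustrative assumptions and may differ from what DS Lab generates.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic daily series with a weekly pattern (illustration only).
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-01", periods=120, freq="D")
sale = 100 + 10 * np.sin(2 * np.pi * np.arange(120) / 7) + rng.normal(0, 2, 120)
df = pd.DataFrame({"date": dates, "sale": sale})


def make_lag_features(frame, target_col="sale", n_lags=7):
    """Add columns lag_1..lag_n holding the target's previous values."""
    out = frame.copy()
    for lag in range(1, n_lags + 1):
        out[f"lag_{lag}"] = out[target_col].shift(lag)
    # The first n_lags rows have no complete history, so they are dropped.
    return out.dropna().reset_index(drop=True)


features = make_lag_features(df)
feature_cols = [c for c in features.columns if c.startswith("lag_")]

# Chronological split: the last 14 days are held out, with no shuffling.
train, test = features.iloc[:-14], features.iloc[-14:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(train[feature_cols], train["sale"])

# One-step-ahead predictions: each row uses the observed previous values.
predictions = model.predict(test[feature_cols])

print("MAE on hold-out:", mean_absolute_error(test["sale"], predictions))
```

The split is chronological rather than random because a shuffled split would let the model train on observations that come after the ones it is tested on.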

Step 8 – Validate Results and Clean Up

  • Compare prediction accuracy and graphs from both algorithms (N-BEATS vs. Random Forest).

  • Save results or metrics as needed.

  • Deactivate the project to release compute resources.
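
To compare the two models on numbers rather than on graphs alone, the same error metrics can be computed for both sets of predictions over the same hold-out period. This is a minimal sketch with NumPy; the helper name forecast_metrics is hypothetical and the values are synthetic, not results from either model.

```python
import numpy as np


def forecast_metrics(actual, predicted):
    """Return MAE, RMSE, and MAPE (in percent) for one set of forecasts."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    return {
        "MAE": float(np.mean(np.abs(errors))),
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        # MAPE is undefined when any actual value is zero.
        "MAPE": float(np.mean(np.abs(errors / actual)) * 100),
    }


# Synthetic values for illustration only; not results from either model.
actual = [100, 200, 400]
nbeats_pred = [110, 190, 400]
rf_pred = [90, 220, 380]

for name, pred in [("N-BEATS", nbeats_pred), ("Random Forest", rf_pred)]:
    print(name, forecast_metrics(actual, pred))
```

Lower values are better for all three metrics. RMSE penalizes large individual errors more heavily than MAE does, so the two can rank models differently.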

Outcome

You have successfully:

  • Created a forecasting project in DS Lab.

  • Utilized the N-BEATS algorithm for time-series prediction.

  • Saved a processed DataFrame as an artifact.

  • Reused artifact data to build a Random Forest forecasting model.