Data
The Data option allows users to add data to their project from the Data Science Notebook infrastructure.
The Data tab in a Data Science Lab (DSLab) Notebook allows users to add datasets, Data Sandbox files, or feature store entries, and access them within notebook code cells for analysis, modeling, and experimentation.
Steps to Add Data to a Notebook
Navigation path: Data Science Lab > Project > Workspace > Left-side Panel > Data
Open a notebook in the Workspace.
Click the Data icon in the left-side panel of the notebook.
Click the Add icon next to Data.
The Add Data drawer opens, displaying the Data Source, Search Data Connector Type, and Search Data Connector fields, each with a drop-down menu for narrowing the selection.
Follow the prompts to add a dataset. The available steps may differ based on the data source (e.g., uploaded files, sandbox files, or feature store).
Adding Data to a Data Science Lab Project
Data Science Lab (DSLab) provides multiple options to add data to your projects, including:
Datasets from the Data Center module
Data Sandbox files from local or configured sandbox environments
Feature Stores for machine learning workflows
Adding data allows notebooks to access structured data, perform analysis, and train models within the project.
Prerequisites
Users must have permission to access the Data Center module of the platform.
Users must have the required datasets listed under the Data Center module.
For Data Sandbox files, Sandbox Settings must be configured to enable access within DSLab.
Adding Data Sets
Navigation path: Data Science Lab > Project > Workspace > Data > Add Data > Data Sets (as Data Source)
Open the DSLab project.
Click the Data tab.
Click Add Data to open the Add Data page.
Select Data Sets from the Data Source drop-down menu (default selection is Data Service).
Use the Search bar to locate the required datasets.
Select the dataset(s) using the checkboxes next to the data entries.
Click Add.
The selected dataset(s) are added to the project.
A notification message confirms that the dataset(s) have been successfully added.
Uploading and Adding Data Sandbox Files
Navigation path: Data Science Lab > Project > Workspace > Data > Add Data > Data Sandbox (as Data Source)
Uploading a Data Sandbox File
Open the DSLab project.
Click the Data tab.
Click Add Data.
Select Data Sandbox from the Data Source drop-down menu.
Click Upload to open the Upload Data Sandbox page.
Provide the following information:
Sandbox Name: Name of the sandbox file.
Description: Optional description.
Click Choose File and select a file from your system (supported formats: CSV, XLSX).
Click Save to begin the upload.
Wait until the upload progress reaches 100%.
A notification confirms that the file has been successfully uploaded.
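Before uploading, it can help to check locally that a file matches the supported formats listed above. The following is a minimal sketch, not part of DSLab itself: the function name check_sandbox_file and the specific checks (extension, non-empty file, readable CSV header) are assumptions chosen for illustration, and the platform may apply its own validation.

```python
import csv
from pathlib import Path

# Formats named in the upload step (CSV, XLSX).
SUPPORTED_EXTENSIONS = {".csv", ".xlsx"}


def check_sandbox_file(path):
    """Return a list of problems; an empty list means the file looks ready to upload.

    Hypothetical local helper; it does not call any DSLab API.
    """
    file = Path(path)
    if not file.is_file():
        return [f"{file} does not exist or is not a file"]

    problems = []
    suffix = file.suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension {file.suffix!r}")

    size = file.stat().st_size
    if size == 0:
        problems.append("file is empty")

    # For CSV files, confirm that a header row can be read.
    if suffix == ".csv" and size > 0:
        with file.open(newline="", encoding="utf-8", errors="replace") as handle:
            header = next(csv.reader(handle), None)
        if not header:
            problems.append("CSV has no header row")

    return problems
```

For example, check_sandbox_file("sales.csv") returns an empty list when the file exists, is non-empty, and starts with a header row; any returned messages point to issues worth fixing before clicking Choose File.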
Adding Uploaded Data Sandbox Files to a Project
Return to the Add Data page.
Select Data Sandbox from the Data Source drop-down menu.
Use the Search bar to locate the uploaded sandbox file.
Select the sandbox file using the checkbox next to it.
Click Add.
The added sandbox file is now listed under the Datasets tab.
A notification message confirms that the sandbox file has been successfully added.
Adding Feature Stores
Navigation path: Data Science Lab > Project > Workspace > Data > Add Data > Feature Stores (as Data Source)
Open the DSLab project.
Click the Data tab.
Click Add Data to open the Add Data page.
Select Feature Stores from the Data Source drop-down menu.
Use the Search bar to locate the required feature store(s).
Select the feature store(s) using the checkboxes next to the entries.
Click Add.
The selected feature store(s) are added to the project.
A notification message confirms that the feature store(s) have been successfully added.
Notes
The Search bar is available for all data types (Datasets, Data Sandbox files, Feature Stores) to quickly locate the required entries.
Only users with appropriate permissions can access Data Center datasets or upload sandbox files.
Each added data source is immediately accessible within notebooks for analysis, model training, or transformations.
Steps to Read Added Data
Add a new code cell in the notebook, or select an existing empty code cell.
From the Data tab, select the dataset you want to read.
The get_data function automatically appears in the code cell.
Assign the dataset to a variable (commonly df) to access its contents:
    df = get_data("Dataset_Name")
    print(df)
Click the Run cell icon to execute the code.
The data preview will be displayed below the code cell once execution is complete.
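If a dataset name is mistyped or the dataset has not been added to the project, the read step may fail in a way that is hard to interpret. The sketch below wraps the read call so that such failures produce a clearer message. It rests on assumptions: read_added_data and demo_loader are hypothetical names, and how get_data actually behaves for a missing dataset (raising an error or returning nothing) is not documented here. The loader is passed in as a parameter so the example also runs outside a DSLab notebook.

```python
def read_added_data(name, loader):
    """Call the given loader (for example, get_data) and fail with a clear message.

    Hypothetical helper; the failure modes handled here are assumptions.
    """
    try:
        data = loader(name)
    except Exception as exc:
        raise RuntimeError(
            f"Could not read {name!r}. Check that it is listed under the Data tab."
        ) from exc
    if data is None:
        raise ValueError(f"{name!r} returned no data.")
    return data


# Stand-in data and loader so the sketch runs outside a notebook (hypothetical).
_DEMO_DATA = {"Dataset_Name": [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}]}


def demo_loader(name):
    # Raises KeyError for unknown names, mimicking a failed read.
    return _DEMO_DATA[name]


rows = read_added_data("Dataset_Name", demo_loader)
print(rows)
```

Inside a notebook, the same helper could be called as df = read_added_data("Dataset_Name", get_data), with get_data being the function inserted from the Data tab.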