Data Science Lab
  • What is Data Science Lab?
  • Accessing the Data Science Lab Module
  • Data Science Lab Quick Start Flow
  • Project
    • Environments
    • Creating a Project
    • Project List
      • View
      • Keep Multiple Versions of a Project
      • Sharing a Project
      • Editing a Project
      • Activating a Project
      • Deactivating a Project
      • Deleting a Project
    • Tabs for a Data Science Lab Project
      • Tabs for TensorFlow and PyTorch Environment
        • Notebook
          • Ways to Access Notebook
            • Create
            • Import
              • Importing a Notebook
              • Pull from Git
          • Notebook Page
            • Preview Notebook
            • Notebook Cells
              • Using a Code Cell
              • Using a Markdown Cell
              • Using an Assist Cell
            • Renaming a Notebook
            • Resource Utilization Graph
            • Notebook Taskbar
            • Notebook Operations
              • Datasets
                • Copy Path (for Sandbox files)
              • Secrets
              • Algorithms
              • Transforms
              • Utility
              • Models
                • Model Explainer
                • Registering & Unregistering a Model
                • Model Filter
              • Artifacts
              • Files
              • Variable Explorer
              • Writers
              • Find and Replace
            • Notebook Actions
          • Notebook List
            • Notebook List Actions
              • Export
                • Export to Pipeline
                • Export to GIT
              • Register as Job
              • Notebook Version Control
              • Sharing a Notebook
              • Deleting a Notebook
        • Dataset
          • Adding Data Sets
            • Data Sets
            • Data Sandbox
          • Dataset List Page
            • Preview
            • Data Profile
            • Create Experiment
            • Data Preparation
            • Delete
        • Utility
          • Pull from Git (Utility)
        • Model
          • Model Explainer
          • Import Model
          • Export to GIT
          • Register a Model
          • Unregister A Model
          • Register a Model as an API Service
            • Register a Model as an API
            • Register an API Client
            • Pass Model Values in Postman
          • AutoML Models
        • Auto ML
          • Creating AutoML Experiments
            • Creating an Experiment
          • AutoML List Page
            • View Report
              • Details
              • Models
                • View Explanation
                  • Model Summary
                  • Model Interpretation
                    • Classification Model Explainer
                    • Regression Model Explainer
                    • Forecasting Model Explainer
                  • Dataset Explainer
            • Delete
      • Tabs for PySpark Environment
        • Notebook
          • Ways to Access Notebook
            • Create
            • Import
              • Importing a Notebook
          • Notebook Page
            • Preview Notebook
            • Notebook Cells
              • Using a Code Cell
              • Using a Markdown Cell
              • Using an Assist Cell
            • Renaming a Notebook
            • Resource Utilization Graph
            • Notebook Taskbar
            • Notebook Operations
              • Datasets
                • Copy Path (for Sandbox files)
              • Secrets
              • Utility
              • Files
              • Variable Explorer
              • Writers
              • Find and Replace
            • Notebook Actions
          • Notebook List
            • Notebook List Actions
              • Export
                • Export to Pipeline
                • Export to GIT (on hold)
              • Register as Job
              • Notebook Version Control
              • Sharing a Notebook
              • Deleting a Notebook
        • Dataset
          • Adding Data Sets
            • Data Sets
            • Data Sandbox
          • Dataset List Page
            • Preview
            • Data Profile
            • Data Preparation
            • Delete
        • Utility
Powered by GitBook
On this page

What is Data Science Lab?

The BDB Data Science Lab serves as a collaborative hub for data scientists to work together. Within this module, they can collectively conduct experiments, exchange Notebooks, models, and other important elements with their team. This collaborative environment allows for validation and seamless deployment of these resources to the Production environment.

What is a Data Science Project?

A Data Science Project created inside the Data Science Lab is like a Workspace inside which the user can create and store multiple data science experiments and their associated artifacts.

Please Note: The user can create a new Notebook for coding or upload an existing Notebook only after Activating a Data Science Project.

What is a Notebook/ Data Science Notebook?

A Data Science Notebook is an interactive and collaborative digital platform used by data scientists and analysts for data exploration, analysis, modeling, and visualization. It combines executable code, visualizations, and explanatory text in a flexible and shareable format, making it a versatile tool for data science projects. Key features include code execution, rich text and visualizations, interactive data exploration, collaboration and sharing, reproducibility and documentation, and integration with data science libraries and tools.

What is Data Set in context to Data Science Lab module?

A dataset in data science is a structured collection of data used for analysis and modeling. It represents a specific domain or problem and can include various data types. Datasets are essential for tasks like data analysis, modeling, and extracting insights in both supervised and unsupervised learning. They can be sourced from different domains and collected from surveys, experiments, or existing databases. Datasets contain features and labels for supervised learning, while they are unlabeled for unsupervised learning. They are typically split into training and test sets. Publicly available datasets are widely used for research and benchmarking. Datasets form the foundation for various data science tasks and enable solving complex problems.

The Data Set tab provided under the Data Science Lab module supports the following types of Data sets:

  1. Dataset - Here, Dataset stands for a table or filtered data from database.

  2. Data Sandbox - Data Sandbox are files that are uploaded or appended to Data Sandbox folder from local directory (excel, csv, text etc.).

What is a Data Science Model?

A data science model refers to a mathematical or computational representation of a real-world phenomenon or problem that data scientists use to make predictions, gain insights, or automate decision-making processes. It is a key component of the data science workflow and is built using data, statistical techniques, and algorithms.

The Model tab under a Data Science Project includes:

  1. Imported Models: Models trained using external tools and libraries, which are brought into the data science workflow for analysis or prediction tasks.

  2. Models created in Data Science Lab Notebook: Models built and trained within the Data Science Lab Notebook environment, utilizing its features and capabilities.

  3. AutoML Models: Models generated through automated machine learning (AutoML) techniques, which automatically search and select the best model based on the given data and desired outcome.

What is Utility script?

The Utility tab allows to create and list the python scripts (.py files) that can be imported to your notebook.

What is AutoML?

AutoML (Automated Machine Learning) refers to the automated process of building and optimizing machine learning models without extensive manual intervention. It leverages intelligent algorithms and techniques to automate tasks such as data preprocessing, feature selection, model selection, hyperparameter tuning, and model evaluation. AutoML aims to simplify and accelerate the model development process, enabling users with limited machine learning expertise to create effective models efficiently.

The Auto ML tab allows the users to create data science experiments and lists them.

Please Note: Python upgraded to 3.10.0 for the entire DS Lab module.

NextAccessing the Data Science Lab Module

Last updated 1 year ago