Algorithms

Learn how to configure algorithm settings and project-level access so that algorithms can be used inside a notebook.

Using Algorithms in Data Science Lab Projects

The Algorithms functionality in Data Science Lab (DSLab) allows users to select pre-defined algorithms at the project level and apply them inside notebooks for model training, prediction, and evaluation. This feature provides a streamlined approach to implement machine learning workflows without writing boilerplate code.

Note: Algorithms selected during project creation are available to all notebooks within that project.

Selecting Algorithms During Project Creation

Navigation path: Data Science Lab > Projects > Create > Algorithms

  1. Open the Create Project page.

  2. From the Algorithms drop-down menu, select the desired algorithm categories using the checkboxes.

  3. Selected algorithm categories appear in the field, separated by commas.

    • The supported algorithm categories in Data Science Lab are Regression, Classification, Forecasting, Unsupervised Learning, and Natural Language Processing.

  4. Complete all other required project fields.

  5. Click Save to create the project.

Using Algorithms Inside a Notebook

Navigation path: Data Science Lab > Workspace > Left-side panel > Algorithms tab > Notebook Code cell > Algorithm sub-category

  1. Open the Workspace tab inside the activated project.

  2. Add a dataset to a notebook and run the code cell that loads it.

  3. Click the Algorithms icon from the left-side panel under Workspace.

  4. Add a new code cell in the notebook.

  5. A list of algorithms selected at the Project level is displayed.

  6. Select the desired algorithm sub-category using a checkbox.

  7. Pre-defined code for the selected algorithm type is automatically added to the code cell.

  8. Define necessary variables in the code cell, such as:

    • Data columns

    • Target column

  9. Run the code cell.

  10. After execution, predictions or results based on the test dataset appear below the code cell.

Note: Results appear only if the code cell containing the dataset details has been executed before the algorithm code cell is run.

Saving and Registering Algorithm-Based Models

  • After training, the model can be saved under the Models tab.

  • Algorithm-based models can be registered for use inside the Data Pipeline module.

  • Models can also be exported as API services.

    • Refer to Register a Model as an API Service for detailed instructions.

List of Available Algorithms

The Algorithms section provides pre-built solutions across five key categories:

  1. Regression – Standard regression models for numerical prediction.

    Unlock predictive insights with various regression techniques tailored for accurate data modeling. The supported regression algorithms in Data Science Lab are:

    • Linear Regression

    • SVR

    • KNN Regressor

    • Bagging Regressor

    • Decision Tree Regressor

    • Random Forest Regressor

    • Extremely Randomized Trees Regressor

    • AdaBoost Regressor

    • GBM Regressor

    • XGBoost Regressor

  2. Classification – Supervised learning for categorical outcomes.

    Leverage advanced classification algorithms to categorize data and enhance decision-making.

    • AdaBoost Classifier

    • Logistic Regression

    • Decision Tree Classifier

    • Random Forest Classifier

    • SVC

    • XGBoost Classifier

    • Bagging Classifier

    • GBM Classifier

    • Extremely Randomized Trees Classifier

    • Bayes Classifier

    • LGBM Classifier

    • CatBoost Classifier

    • KNN Classifier

  3. Forecasting – Time series prediction (requires admin enablement).

    Accurately anticipate trends and future outcomes using cutting-edge forecasting algorithms.

    • ARIMA(X)

    • SARIMA(X)

    • Auto ARIMA

    • Exponential Smoothing

    • N-BEATS

    • Prophet

    • Random Forest

  4. Unsupervised Learning – Clustering and dimensionality reduction (requires admin enablement).

    These algorithms are mainly used to discover hidden patterns in data without pre-labeled outcomes.

      • Clustering

        • KMeans

        • KMeans++

        • Spectral Clustering

        • Agglomerative Clustering

        • DBSCAN

        • OPTICS

      • Anomaly Detection

        • Elliptic Envelope

        • Local Outlier Factor

        • One Class SVM

        • SGD One Class SVM

        • Isolation Forest

  5. Natural Language Processing (NLP) – Text-based algorithms for language data (requires admin enablement).

    Harness the power of NLP to derive meaningful insights from unstructured text data.

    Users can apply any of the listed NLP algorithms to analyze text and extract meaningful output from it.

    • Sequence Classification: Sentiment Analysis, Topic Labelling, Zero-shot Classification

    • Token Classification: Named Entity Recognition, Part of Speech Tagging

    • Summarization
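
To make one of the listed techniques concrete, here is a minimal sketch of simple exponential smoothing, the most basic member of the Exponential Smoothing family named under Forecasting. It is a from-scratch illustration in plain Python, not the code DSLab generates; the function name, the `alpha` value, and the sample series are assumptions chosen for the example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation in `series`.

    Each level is alpha * observation + (1 - alpha) * previous level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        raise ValueError("series must not be empty")
    levels = [series[0]]  # initialize the level with the first observation
    for value in series[1:]:
        levels.append(alpha * value + (1 - alpha) * levels[-1])
    return levels


# Hypothetical demand figures, for illustration only.
demand = [10.0, 12.0, 14.0, 13.0]
levels = simple_exponential_smoothing(demand, alpha=0.5)

# The one-step-ahead forecast is the last smoothed level.
forecast = levels[-1]
print(levels)    # [10.0, 11.0, 12.5, 12.75]
print(forecast)  # 12.75
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one smooths more heavily.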

Note: By default, all users have access to Regression and Classification algorithms. Access to Forecasting, Unsupervised Learning, and NLP sub-categories must be enabled by an administrator.
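
The access rule in the note above can be summarized as a small lookup. The sketch below is purely illustrative: `REQUIRES_ADMIN` and `available_categories` are hypothetical names invented for this example and do not correspond to any DSLab API or configuration file.

```python
# Whether each category needs administrator enablement, per the note above.
REQUIRES_ADMIN = {
    "Regression": False,
    "Classification": False,
    "Forecasting": True,
    "Unsupervised Learning": True,
    "Natural Language Processing": True,
}


def available_categories(admin_enabled=()):
    """Return the categories a user can select.

    `admin_enabled` lists the restricted categories an administrator
    has switched on for this user (hypothetical input).
    """
    enabled = set(admin_enabled)
    unknown = enabled - set(REQUIRES_ADMIN)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [
        name
        for name, needs_admin in REQUIRES_ADMIN.items()
        if not needs_admin or name in enabled
    ]


print(available_categories())
# ['Regression', 'Classification']
print(available_categories(["Forecasting"]))
# ['Regression', 'Classification', 'Forecasting']
```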