Operations

This section describes the various operations available within a Data Science Project.

Note: The available operations vary depending on the environment selected when the project is created.

  • Projects created under the PySpark environment support only a subset of operations: Data, Secrets, Variable Explorer, and Writers.

  • Projects created under Python TensorFlow or Python PyTorch environments support the broader set of operations, including Transforms, Models, and Artifacts for machine learning workflows.

Operations in a Data Science Notebook

1. Data

  • Add data to the notebook for analysis and modeling.

  • View a list of all datasets added to the notebook.

  • Supports multiple file types and sources; a minimal loading sketch follows this list.
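
A minimal loading sketch, assuming an added CSV dataset is exposed as a file in the notebook's working directory (the file name sales.csv is hypothetical):

    import pandas as pd

    # Hypothetical file name; the actual location of an added dataset
    # depends on how the platform mounts it into the notebook.
    df = pd.read_csv("sales.csv")
    print(df.shape)
    print(df.head())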

2. Secrets

  • Create and manage environment variables to store confidential information securely.

  • Prevent sensitive data such as API keys, passwords, or tokens from being exposed in notebook code, as shown in the sketch below.
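
Because secrets are surfaced as environment variables, they can be read with the standard library instead of being hard-coded; the secret name API_KEY below is a hypothetical example:

    import os

    # Read a secret created via the Secrets operation; never hard-code the value.
    api_key = os.environ.get("API_KEY")
    if api_key is None:
        raise RuntimeError("Secret API_KEY is not set for this notebook")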

3. Algorithms

  • Access algorithm settings at the project level.

  • Configure and use machine learning algorithms directly inside the notebook.

  • Share algorithm configurations across all notebooks in the project; an illustrative configuration follows this list.
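
The platform's own configuration schema is not documented here; as a generic illustration with scikit-learn, a shared algorithm configuration might map onto an estimator like this:

    from sklearn.ensemble import RandomForestClassifier

    # Illustrative settings only; the Algorithms operation stores values like
    # these at the project level so every notebook resolves the same configuration.
    algo_config = {"n_estimators": 200, "max_depth": 10, "random_state": 42}
    model = RandomForestClassifier(**algo_config)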

4. Transforms

  • Save and load models using transform scripts.

  • Register models or publish them as APIs through the DS Lab module.

  • Supports reproducible data transformations and workflow integration, as sketched below.
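
A minimal sketch of a reproducible transform using scikit-learn and joblib; the file name and workflow are illustrative, since the platform's own transform-script format may differ:

    import joblib
    import numpy as np
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

    # Fit a reusable transformation and persist it to disk.
    transform = Pipeline([("scale", StandardScaler())])
    transform.fit(X_train)
    joblib.dump(transform, "transform.pkl")

    # Later (or in another notebook), reload and apply the identical transform.
    transform = joblib.load("transform.pkl")
    X_scaled = transform.transform(X_train)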

5. Models

  • Train, save, and load models using frameworks such as Scikit-learn, TensorFlow/Keras, and PyTorch.

  • Register models for use in pipelines or shared environments.

  • For detailed instructions, refer to Model Creation using Data Science Notebook; a minimal train/save/load sketch follows this list.
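
A minimal train/save/load cycle with scikit-learn and joblib; registering the saved model or publishing it as an API is then done through DS Lab rather than in code:

    import joblib
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # Train a model inside the notebook.
    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    # Save the trained model, then reload it for reuse or registration.
    joblib.dump(model, "model.pkl")
    model = joblib.load("model.pkl")
    print(model.score(X, y))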

6. Artifacts

  • Save plots, visualizations, and datasets as artifacts inside the notebook.

  • Artifacts provide a way to store and reuse results generated during experiments (see the sketch below).
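
A sketch of producing a plot and a dataset as files; how saved files are promoted to artifacts is platform-specific, so only the file-producing side is shown:

    import matplotlib
    matplotlib.use("Agg")  # render without a display
    import matplotlib.pyplot as plt
    import pandas as pd

    # Save a plot to an image file.
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 8])
    fig.savefig("growth.png")

    # Save a small result set alongside it.
    pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 8]}).to_csv("results.csv", index=False)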

7. Variable Explorer

  • Inspect detailed information about variables declared in the notebook.

  • Monitor variable types, values, and memory usage for debugging and analysis; the sketch below shows the equivalent introspection in code.
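
The panel itself requires no code; for reference, the equivalent introspection in plain Python looks like this:

    import sys

    counts = [1, 2, 3]
    label = "experiment-1"

    # Print the same details the Variable Explorer surfaces: name, type,
    # approximate memory usage, and current value.
    for name, value in [("counts", counts), ("label", label)]:
        print(name, type(value).__name__, sys.getsizeof(value), repr(value))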

8. Writers

  • Write experiment outputs to supported database writers.

  • Supports batch and incremental writes for integration with downstream pipelines or analytics systems, as sketched below.
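
A generic sketch using pandas and SQLAlchemy; the connection string and table name are hypothetical, and in practice the writer configured for the project supplies the connection details:

    import pandas as pd
    from sqlalchemy import create_engine

    # Hypothetical connection string; substitute your configured writer's target.
    engine = create_engine("postgresql://user:password@host:5432/analytics")

    results = pd.DataFrame({"run_id": [1], "accuracy": [0.93]})
    # if_exists="append" gives incremental writes; "replace" rewrites in batch.
    results.to_sql("experiment_results", engine, if_exists="append", index=False)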


Notes

  • The available operations are context-sensitive, depending on the environment in which the notebook is created.

  • Operations such as Transforms, Models, and Artifacts are available only in the TensorFlow and PyTorch environments.

  • PySpark notebooks are limited to the Data, Secrets, Variable Explorer, and Writers operations.