Workflow 4
To build a Sentiment Analysis model in DS Lab, register it as an API through the Admin Module, and validate its response via API requests.
In Workflow 4, we explore the ease and effectiveness of creating a Sentiment Analysis Model within the DS Lab Notebook. Once the model is built, we use the API Client Registration feature of the Admin Module to register the model as an API.
After registration, the model is tested by sending an API request directly from the notebook, allowing us to validate its response against input data. This workflow showcases how DS Lab enables seamless model development, deployment, and real-time consumption through APIs.
DS Lab Project Creation
Here, we will walk through the step-by-step process of creating a new Data Science Lab project and performing a variety of data science tasks on the dataset.
· From the BDB Homepage, click the Apps icon and select the DS Lab module.
· Create a new DS Lab project by clicking the Create button.
· Provide a Project Name and Description.
· Select the Algorithm type as Classification.
· Specify the Environment and allocate the required Resources.
· Set the Idle Shutdown Time.
· In the External Library section, add the required libraries, such as spaCy and tqdm.

· This will eliminate the need to download these libraries within the notebook.
· Finally, save your project.
Create Sentiment Analysis Model
· Activate the Project by clicking the Activate option on the right.
· Once activated, open the project by clicking View, located to the right of the project name.
· Wait for the kernel to start.
· In the Repo section, click the three dots and select Import.
· Provide an appropriate name and description for the notebook.
· Choose the notebook file from your local system and upload it.

· Click the Data icon on the left side of the workspace to add a sandbox as data.
· After clicking the Data icon, click the ‘+’ icon located to the right of the search bar.
· An Add Data page will appear. From the Data Source dropdown, select Data Sandbox Files.
· Click the Upload option on the right side of the page.
· Provide a name and description for the sandbox, then select the sandbox file from your local system, i.e. “sentiment data”.
· Click Save to upload the file. Once uploaded successfully, a confirmation message will appear: “File is uploaded”.
· Select the checkbox next to the newly uploaded sandbox and click Add to add it as data.
· In the notebook cell, click the checkbox for the sandbox under the Data section.
· A code snippet will automatically appear in the cell to load the sandbox data.
· Run the cell using the Run Cell icon in the top-left corner of the cell.
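The auto-generated snippet typically reads the sandbox file into a pandas DataFrame. The exact helper and file path are produced by DS Lab, so the sketch below is only an approximation of what the generated code does, using an inline CSV sample in place of the real “sentiment data” file:

```python
import io
import pandas as pd

# Stand-in for the sandbox file; the real path is filled in by DS Lab
csv_data = io.StringIO(
    "reviewText,overall\n"
    "Great product!,5\n"
    "Not what I expected.,2\n"
)

# The generated loader ultimately yields a DataFrame like this
df = pd.read_csv(csv_data)
print(df.columns.tolist())
```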


The imported notebook contains the code used to build the sentiment analysis model.
Here is a complete explanation of the code:
1. Keep only the review text and rating
df = df[['reviewText', 'overall']]
Takes only the reviewText and overall columns from your DataFrame df.
2. Map numeric ratings to textual sentiment
df['sentiment'] = df['overall'].map({
    5: 'positive',
    4: 'positive',
    3: 'neutral',
    2: 'negative',
    1: 'negative'
})
Creates a new column sentiment where 5/4 → "positive", 3 → "neutral", 2/1 → "negative".
3. Drop the original numeric rating column
df = df.drop(columns='overall')
df
Removes overall so the DataFrame now contains reviewText and sentiment.
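Steps 1–3 can be checked end to end on a toy DataFrame standing in for the review data:

```python
import pandas as pd

# Toy DataFrame standing in for the Amazon-style review data
df = pd.DataFrame({
    "reviewText": ["Loved it", "It was okay", "Terrible"],
    "overall": [5, 3, 1],
    "reviewerID": ["a", "b", "c"],  # extra column that step 1 discards
})

# 1. Keep only the review text and rating
df = df[["reviewText", "overall"]]

# 2. Map numeric ratings to textual sentiment
df["sentiment"] = df["overall"].map(
    {5: "positive", 4: "positive", 3: "neutral", 2: "negative", 1: "negative"}
)

# 3. Drop the original numeric rating column
df = df.drop(columns="overall")
print(df["sentiment"].tolist())  # ['positive', 'neutral', 'negative']
```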
4. Install spaCy & download small English model (Jupyter cell magic)
%%bash
pip install spacy tqdm
python -m spacy download en_core_web_sm
This runs shell commands from a notebook cell. Note: you install spaCy and download the model, but nothing in the later code uses spaCy — so this is unnecessary unless you plan extra NLP preprocessing.
5. Subset to first 200 rows
# subset and train only on the first 200 rows of data
data = df.iloc[:200].copy()
text_col = 'reviewText'
Creates a small dataset of 200 samples to speed up experiments. text_col is the column containing text.
After running the above code, navigate to the Algorithms section (located in the same panel as the Data section). Click Classification and then choose the Logistic Regression option.
Before choosing Logistic Regression, do not forget to click on the cell where you want the code to be auto-generated.
Once you complete this, the code will be auto-generated; it helps you create the model more efficiently.
Run the cell to apply logistic regression to the data.
6. TF-IDF vectorization
from sklearn.feature_extraction.text import TfidfVectorizer
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(data[text_col])
TfidfVectorizer() learns vocabulary & idf from data[text_col] and transforms the texts into a sparse matrix X_tfidf of shape (200, vocab_size).
Important: fit_transform is called on the entire data before the train/test split — this leaks vocabulary and idf statistics from the test set into training.
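A common leak-free alternative (not what the auto-generated code does) is to split the raw texts first and fit the vectorizer on the training portion only. A minimal sketch on toy data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Toy corpus standing in for data[text_col] and data['sentiment']
texts = ["good product", "bad quality", "great value",
         "awful experience", "really good", "really bad"]
labels = ["positive", "negative", "positive",
          "negative", "positive", "negative"]

# Split the raw texts first, then fit TF-IDF on the training split only,
# so no vocabulary or idf statistics are learned from the test set
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, random_state=42
)
tfidf = TfidfVectorizer()
X_train = tfidf.fit_transform(train_texts)  # fit on train only
X_test = tfidf.transform(test_texts)        # transform test with the train vocabulary
print(X_train.shape[0], X_test.shape[0])    # 4 2
```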
7. Train / test split
from sklearn.model_selection import train_test_split
X, y = X_tfidf, data['sentiment']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
Splits the data: 67% train, 33% test. random_state=42 makes the split reproducible.
8. Fit Logistic Regression
from sklearn.linear_model import LogisticRegression
sentiment = LogisticRegression()
sentiment.fit(X_train,y_train);
Fits a logistic regression classifier on the TF-IDF features. Default solver is used (usually lbfgs), and default max_iter (could be insufficient for larger data).
9. Check accuracy (unused value)
sentiment.score(X_test,y_test)
This computes the accuracy on the test set but the result is not printed or saved.
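To surface the value, assign it and print it. A tiny self-contained stand-in for the notebook's model (the corpus here is illustrative, not the sandbox data):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny stand-in corpus for the notebook's features and labels
texts = ["good", "bad", "good good", "bad bad"]
labels = ["positive", "negative", "positive", "negative"]
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

# .score returns mean accuracy; capture it so it is actually reported
accuracy = clf.score(X, labels)
print(f"Accuracy: {accuracy:.3f}")
```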
10. Predict and print classification reports
from sklearn.metrics import classification_report

y_pred_train = sentiment.predict(X_train)
y_pred_test = sentiment.predict(X_test)
print(classification_report(y_train, y_pred_train))
print(classification_report(y_test, y_pred_test))
classification_report shows precision, recall, f1-score and support for each sentiment class for both train and test sets.
11. Prepare a single sample for API testing
api_test_input = X_tfidf[0].todense().tolist()
Takes the first TF-IDF row (a sparse vector), converts it to a dense nested Python list. This becomes the JSON payload. (Converting to dense is memory-heavy if the vector is large; for one sample it's okay but not scalable.)
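The conversion can be seen on a tiny vectorizer: the first TF-IDF row becomes a nested Python list that json.dumps can serialize for the API payload.

```python
import json
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus in place of the review data
texts = ["great phone", "bad battery"]
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(texts)

# Sparse row -> dense matrix -> nested Python list -> JSON string
api_test_input = X_tfidf[0].todense().tolist()
payload = json.dumps(api_test_input)
print(len(api_test_input), len(api_test_input[0]))
```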

Saving and Registering a Model in DS Lab
Run each cell of the notebook to avoid errors.
In the cell menu, click the three dots on the right-hand side and choose Save Model to auto-populate the code.
Specify the model, model name, and model type for saving.
Execute the cell to save the model.
Once the model is saved, navigate to the Models tab on the left side of the DS Lab module (above the AutoML section) and select the model.
Click the Register icon (upward arrow) in the top-right corner of the model page to register it.
Model as API and API Client Registration
Here, we will cover the step-by-step process of registering the model as an API and using the Admin Module for API Client Registration.
To Begin,
· Register the model as an API by navigating to the model tab and selecting the desired model for registration.
· Once selected, proceed to register it as an API by clicking on “register as API” option.
· Provide necessary information, such as instance and resources, and save the registration details.
· API client registration is restricted to administrators and is not available to regular users.
· Users can use the information provided in the attached video or document for experimentation purposes.
· Access the admin plugin in the apps menu to initiate API client registration.
· Two options will be presented: internal and external. Choose "internal" for the current task.
· Enter the client's name and email address, where the client will receive API information.
· Provide the App name, request per hour, and request per day limits for the client's API usage.
· Select the specific model (previously registered) for the API client.
· Save the registration details to complete the process.
· To share API credentials with the client, you can send an email by clicking on the icon representing the secret ID and secret key.
· Alternatively, you can click on the edit icon to access and provide the necessary details directly.

Test Model As API
Here, we will cover the step-by-step process to test the model as an API within the DS Lab notebook by sending an API request to obtain the sentiment response.
To Begin,
· Open DS Lab and navigate to the same notebook to test the model as an API.

· Send HTTP POST request to an external API
import requests, json

url = "https://app.bdb.ai/services/api/sentiment_clf_test.dill"
payload = json.dumps(api_test_input)
headers = {
    'clientid': 'MAVQXGKARSVUUIPFUCIH@5980',
    'clientsecret': 'YSFQSGQSDILQLVHQGVXB1689039162745',
    'appname': 'test1',
    'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print([i['predictions'] for i in response.json()])
[NOTE: Use this code to send a request to the API]
· Builds payload (JSON) and sends it with headers containing clientid and clientsecret (these are secrets — don’t hardcode them).
· Uses requests.request("POST", ...). That’s fine; requests.post(...) is equivalent and a bit clearer.
· Assumes the returned JSON is a list of dicts with a 'predictions' key and prints those predictions.
· Create a variable, 'api_test_input', to store the input data for the model.
· Import the 'requests' and 'json' modules to work with API requests and JSON data.

· Set up the payload with the required input data for the API request.
· Ensure you have the necessary header details like client ID, client secret, and app name, which can be found in the API client registration section or copied from the email received during registration.
· Execute the code to send the API request and receive the output response.

· Print the sentiment response obtained from the API.
· Remember to save the notebook after completing the testing process to retain the changes and results.
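A slightly more defensive version of the request code above, with placeholder credentials and basic error checking. The endpoint URL and the list-of-dicts response shape are taken from the snippet above; requests.post(...) is equivalent to requests.request("POST", ...):

```python
import json
import requests

url = "https://app.bdb.ai/services/api/sentiment_clf_test.dill"
headers = {
    # Placeholder credentials: load real values from the environment or a
    # secrets store rather than hardcoding them in the notebook
    "clientid": "<YOUR_CLIENT_ID>",
    "clientsecret": "<YOUR_CLIENT_SECRET>",
    "appname": "<YOUR_APP_NAME>",
    "Content-Type": "application/json",
}

def get_predictions(api_test_input):
    """Send one TF-IDF row to the registered model API and return predictions."""
    response = requests.post(url, headers=headers, data=json.dumps(api_test_input))
    response.raise_for_status()  # fail loudly on HTTP errors
    return [item["predictions"] for item in response.json()]
```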
This completes Workflow 4 of the DS Lab Module.