Automate Data Analysis using API, AutoML, & Database Integration
This guide explains how to ingest hiring data from an API, prepare it for analysis, run an AutoML model to generate insights, and store the processed data in a database for actionable reporting.
The workflow integrates four key BDB Platform modules:
Data Center for sandbox creation
Data Preparation for transformation
AutoML for model training and inference
Data Pipeline for automation and integration
By the end of this guide, you’ll have a fully automated machine learning workflow that performs real-time data ingestion, transformation, analysis, and output storage for actionable business insights.
Architecture Overview
Workflow Flow
| Stage | Process | Module Used |
| --- | --- | --- |
| 1 | Data ingestion from API | Data Pipeline |
| 2 | Data cleaning and transformation | Data Preparation |
| 3 | Model training and inference | AutoML (DS Lab) |
| 4 | Output writing into database | Data Pipeline (DB Writer) |
High-Level Flow Diagram
API Source → Data Preparation → AutoML → Database (ClickHouse)
Step 1: API Integration
Purpose
To retrieve real-time hiring data (job listings, candidate profiles, and recruitment metrics) from an API source.
Procedure
Navigate to the BDB Platform Homepage.
Click the Apps icon → Select the Data Pipeline module.
Click Create Pipeline under the Pipeline tab.
Enter:
Pipeline Name
Description
Resource Allocation (Low/Medium/High)
Click Save.
Add API Ingestion Component
If the Components Palette is not visible, open it by clicking the “+” icon.
In the search bar, type API Ingestion.
Drag and drop the component onto the canvas.
Configure the component as follows:
Invocation Type: Real-Time
Ingestion Type: API Ingestion
Click Save to finalize.
Add Kafka Event
From the Event Panel (right side), click the “+” icon to create a Kafka event.
Drag and drop it onto the pipeline canvas.
The Kafka Event automatically connects to the API Ingestion component.
Step 2: Create a Sandbox and Upload the CSV File
Purpose
To create a local workspace for sample hiring data (used for AutoML training and validation).
Procedure
From the BDB Homepage, open the Data Center.
Click the Sandbox tab → Select Create.
Upload your CSV file by:
Dragging and dropping it, or
Clicking Browse to locate the file.
Once the file is added, click Upload.
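For reference, a hypothetical sample of the hiring CSV is shown below. The column names are assumptions inferred from the transformations in Step 3 and the sample payload in Step 10; match them to your actual dataset.

```csv
Candidate_ID,Gender,Experience,Previous_CTC,Offered_CTC,Location
C1001,Female,4,700000,850000,Bangalore
C1002,Male,6,900000,1100000,Pune
C1003,Female,3,550000,650000,Hyderabad
```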

Step 3: Data Preparation
Purpose
To clean, transform, and structure raw hiring data for accurate model training and prediction.
Procedure
In the Sandbox List, click the three dots (⋮) next to your created sandbox.
Select Create Data Preparation.
Apply Transformations
Perform the following cleaning actions:
| Transformation | Action | Result |
| --- | --- | --- |
| Delete Column | Select Gender column → Click Transforms → Delete Column | Removes redundant field |
| Remove Empty Rows | Select Previous CTC and Offered CTC → Click Transforms → Delete Empty Rows | Removes incomplete entries |
Finalize Preparation
Rename the preparation for easy reference.
Review transformation steps under the Steps tab.
Click Save to complete.
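As an offline cross-check of the preparation logic, the same two transformations can be sketched in pandas. This is illustrative only; the file name and column names are assumptions based on the table above.

```python
import pandas as pd

# Load the sandbox CSV (hypothetical file name).
df = pd.read_csv("hiring_data.csv")

# Delete Column: drop the redundant Gender field.
df = df.drop(columns=["Gender"])

# Remove Empty Rows: drop records missing Previous CTC or Offered CTC.
df = df.dropna(subset=["Previous_CTC", "Offered_CTC"])

print(df.head())
```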

Step 4: Create and Run an AutoML Experiment
Purpose
To train and evaluate machine learning models automatically for predictive analytics on hiring data.
Procedure
Open the DS Lab module from the Apps Menu.
Go to the AutoML section and click Create Experiment.
Configure:
Experiment Name: Hiring Data
Experiment Type: Classification
Under Configure Dataset:
Dataset Source: Sandbox
File Type: CSV
Select Sandbox: Choose your sandbox dataset
Under Advanced Information:
Data Preparation: Select the preparation created in Step 3
Target Column: Gender
Click Save to start the experiment.
Monitor AutoML Execution
AutoML will train and test multiple models.
Once complete, click View Report to review:
Model performance metrics
Accuracy comparison
Recommended best-fit model
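Conceptually, the AutoML comparison resembles the scikit-learn sketch below: several candidate classifiers are cross-validated and the highest-scoring one is reported. This is a simplified stand-in, not the platform's actual search logic, and the synthetic dataset is a placeholder for the prepared hiring data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared hiring dataset.
X, y = make_classification(n_samples=500, n_features=6, random_state=42)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(random_state=42),
}

# Cross-validate each candidate and report the best-fit model,
# mirroring the accuracy comparison in the AutoML report.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(f"Best model: {best} (mean CV accuracy = {scores[best]:.3f})")
```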

Step 5: Register the Best Model
Purpose
To register the trained AutoML model so it can be reused in pipelines for real-time or batch predictions.
Procedure
Navigate to the Model section under DS Lab.
Select the desired model from your AutoML results.
Click the Register (arrow) icon.
Confirm registration.

Step 6: Add Data Preparation Component to Pipeline
Purpose
To integrate the previously created data preparation logic into the pipeline for consistent transformation.
Procedure
From the Components Palette, search Data Preparation.
Drag and drop onto the pipeline canvas.
Configure:
Invocation Type: Batch
Data Center Type: Data Sandbox
Sandbox Name: Select the sandbox
Preparation: Choose the saved data preparation
Save configuration.
From the Event Panel, click + → Add a Kafka Event, then connect it.
Step 7: Add AutoML Component
Purpose
To execute the registered AutoML model for predictive analysis on live or processed hiring data.
Procedure
From the Components Palette, search for AutoML Component.
Drag and drop onto the canvas.
Configure:
Invocation Type: Batch
Model Name: Select your registered AutoML model
Save the component.

Add and connect a Kafka Event.
Step 8: Add DB Writer Component
Purpose
To store processed predictions and enriched data into the target database for dashboards and reporting.
Procedure
From the Writer Section, drag and drop the DB Writer component onto the canvas.
Configure:
Invocation Type: Batch
Database Driver: ClickHouse
Save Mode: Append
Fill in Meta Information:
Host
Port
Database Name
Table Name
Username
Password
Validate the connection.
Click Save.
Step 9: Activate and Execute the Pipeline
Purpose
To run the end-to-end workflow and verify data ingestion, transformation, model execution, and output storage.
Procedure
Click the Activate icon on the pipeline toolbar.
Wait until all pods are deployed and running.
Monitor the Logs panel to view real-time execution details.
Component Flow:
API Ingestion → Kafka → Data Preparation → Kafka → AutoML → Kafka → DB Writer

Validate Execution
Open the Preview Tab for each Kafka event to inspect intermediate data.
Confirm:
API ingestion messages are received successfully.
Transformations from Data Preparation are applied.
AutoML component returns predictions.
DB Writer inserts records into the ClickHouse table.
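If your deployment exposes the intermediate Kafka topics directly, they can also be inspected with a small consumer script. The Preview Tab remains the supported route; the topic name and broker address below are hypothetical.

```python
from kafka import KafkaConsumer  # kafka-python package

# Hypothetical topic and broker for the event between Data Preparation
# and AutoML; take the real values from your pipeline's event details.
consumer = KafkaConsumer(
    "hiring-pipeline-prep-output",
    bootstrap_servers="kafka-broker:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=10000,  # stop polling after 10s of silence
)

for message in consumer:
    print(message.value.decode("utf-8"))
```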
Step 10: Test the API Ingestion via Postman
Purpose
To simulate incoming hiring data using the generated API Ingestion endpoint.
Procedure
Open Postman.
Create a New POST Request using the generated Ingestion URL.
Add the following headers:
Ingestion ID
Ingestion Secret
In the Body Tab, choose:
Format: raw → JSON
Add sample JSON matching your model schema:
{ "Candidate_ID": "C1234", "Experience": 5, "Previous_CTC": 800000, "Offered_CTC": 950000, "Location": "Bangalore" }
Click Send.

Expected Response on success:
API Ingestion successful
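The same test can be scripted with Python's requests library. The URL and credentials are placeholders, and the exact header key names are assumptions; copy them from the API Ingestion component's details.

```python
import requests

# Placeholders: use the generated Ingestion URL, ID, and Secret
# from the API Ingestion component's configuration.
url = "https://<your-bdb-host>/ingestion/<generated-endpoint>"
headers = {
    "Ingestion-Id": "<your-ingestion-id>",          # assumed header name
    "Ingestion-Secret": "<your-ingestion-secret>",  # assumed header name
}
payload = {
    "Candidate_ID": "C1234",
    "Experience": 5,
    "Previous_CTC": 800000,
    "Offered_CTC": 950000,
    "Location": "Bangalore",
}

response = requests.post(url, headers=headers, json=payload)
print(response.status_code, response.text)
```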
Step 11: Verify Output and Deactivate Pipeline
Go to the ClickHouse database and confirm the table contains prediction results.
Validate the schema and records against the ingested dataset.
Once confirmed, Deactivate the Pipeline to stop execution and release resources.
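To script the verification instead of checking manually, a minimal query with the clickhouse-driver Python package might look like this; the connection details and table name are placeholders for the values configured in the DB Writer.

```python
from clickhouse_driver import Client

# Placeholder connection details from the DB Writer's Meta Information.
client = Client(host="your-clickhouse-host", port=9000,
                user="your_username", password="your_password",
                database="hiring_db")

# Hypothetical table name; use the one configured in the DB Writer.
count = client.execute("SELECT count(*) FROM hiring_predictions")
print(f"Rows written: {count[0][0]}")

# Peek at a few prediction records.
for row in client.execute("SELECT * FROM hiring_predictions LIMIT 5"):
    print(row)
```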
Monitoring and Troubleshooting
| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| API Ingestion not receiving data | Invalid credentials or endpoint | Recheck Ingestion ID/Secret |
| Data Prep error | Mismatch between schema and source | Validate preparation mapping |
| AutoML failure | Model not registered | Register model before adding it to pipeline |
| DB Writer error | Database connection issue | Verify host, port, and authentication |
Outcome
By following this guide, you have successfully:
Ingested real-time hiring data from an API
Transformed and cleaned the dataset using Data Preparation
Applied an AutoML model for predictions
Stored results into a ClickHouse database
These outputs can now be used for reporting, dashboards, and recruitment performance analytics within the BDB Platform.
Key Benefits
| Capability | Advantage |
| --- | --- |
| API Integration | Real-time hiring data ingestion |
| Data Preparation | Ensures clean, consistent, and accurate data |
| AutoML | Automatic model selection and insight generation |
| Database Integration | Centralized access for analytics and reporting |
Summary
This workflow delivers a complete, production-ready machine learning pipeline — from API ingestion to predictive insights storage. By automating ingestion, transformation, and analysis, organizations can monitor hiring patterns, forecast recruitment metrics, and drive data-backed talent decisions seamlessly through the BDB Platform.