Classification Model Explainer
The Classification Model Explainer provides a comprehensive suite of tools for interpreting how a classification model makes decisions. It is organized into dedicated tabs that help users understand feature importance, individual prediction behavior, scenario testing, and feature dependencies through intuitive visualizations and interactive controls.
Feature Importance Tab
Purpose: The Feature Importance tab identifies which input features have the greatest impact on the model’s predictions, enabling users to interpret global model behavior.
Key interface elements:
Importance Type Selector:
Two available methods are shown:
SHAP Values (currently selected) – a model-agnostic explainability technique that quantifies how each feature contributes to the model’s output.
Permutation Importances – another option that measures how model performance changes when a feature’s values are randomly shuffled.
No. of Features Dropdown:
The user has chosen to display the Top 10 features ranked by their importance scores.
Visualization (Horizontal Bar Chart):
Displays the ranked list of features in descending order of impact, based on SHAP values.
Each bar’s length represents the magnitude of feature contribution to the model’s predictions.

Use Cases:
Identify the most influential drivers of the model's predictions.
Detect irrelevant or redundant features that may be excluded during model retraining.
Validate whether influential features align with domain expertise and business intuition.
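For illustration, the sketch below computes both importance types outside the dashboard, using scikit-learn and the shap package on a synthetic dataset (the model and data here are stand-ins, not the dashboard's internals):

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the user's training data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# SHAP values: per-feature contributions to each prediction; averaging
# their absolute values across rows yields a global importance ranking.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)  # per-class arrays (shape varies by shap version)

# Permutation importance: drop in model score when one feature is shuffled.
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
print(perm.importances_mean)  # one mean score drop per feature
```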
Classification Stats
The Classification Stats section provides analytical tools and visualizations that help users assess the predictive accuracy, discrimination power, and calibration of a classification model. These components collectively enable model validation, threshold optimization, and decision boundary analysis to ensure reliable deployment in real-world scenarios.
1. Global Cutoff
Purpose: The Global Cutoff defines the probability threshold that separates predicted positive outcomes from predicted negative outcomes. Adjusting this cutoff helps optimize model performance for specific business or operational objectives.
Functionality:
Predictions with probabilities above the cutoff are classified as positive, while those below the cutoff are classified as negative.
Users can alternatively define the cutoff as a percentile of all predicted probabilities.
Once set, the cutoff automatically propagates to connected visual and analytical components (e.g., Confusion Matrix, Classification Plot, Lift Curve).
Use Cases:
Optimize the trade-off between sensitivity (recall) and specificity.
Align prediction thresholds with business priorities (e.g., fraud detection sensitivity vs. false alarm rate).
Ensure consistency across all evaluation visualizations and reports.
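As a minimal sketch (assuming an array of predicted positive-class probabilities), both cutoff styles reduce to a single comparison:

```python
import numpy as np

# Stand-in predicted probabilities for the positive class.
rng = np.random.default_rng(0)
proba = rng.random(1000)

# Absolute cutoff: probabilities at or above 0.5 are classified positive.
cutoff = 0.5
y_pred = (proba >= cutoff).astype(int)

# Percentile cutoff: classify the top 25% highest-scoring records positive.
percentile_cutoff = np.quantile(proba, 0.75)
y_pred_pct = (proba >= percentile_cutoff).astype(int)
```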

2. Model Performance Metrics
Purpose: Displays a comprehensive summary of standard classification performance metrics to quantify the model’s predictive power and calibration.
Common Metrics Include:
Accuracy: Ratio of correctly predicted instances to total predictions.
Precision: Fraction of predicted positives that are actually positive.
Recall (Sensitivity): Fraction of actual positives correctly identified.
F1 Score: Harmonic mean of precision and recall.
AUC (Area Under Curve): Measures the model’s ability to distinguish between classes.
Specificity, Log Loss, and Matthews Correlation Coefficient (MCC) where applicable.
Use Cases:
Compare models quantitatively during evaluation or retraining cycles.
Identify strengths and weaknesses (e.g., precision-oriented vs. recall-oriented models).
Establish standardized metrics for governance and performance reporting.
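All of these metrics are available in scikit-learn; a minimal sketch with stand-in labels and scores:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, log_loss,
                             matthews_corrcoef)

# Stand-in labels and positive-class probabilities.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
proba = np.clip(y_true * 0.4 + rng.random(500) * 0.6, 0, 1)
y_pred = (proba >= 0.5).astype(int)

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, proba))   # uses raw scores, not labels
print("Log loss :", log_loss(y_true, proba))
print("MCC      :", matthews_corrcoef(y_true, y_pred))
```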
3. Confusion Matrix
Purpose: The Confusion Matrix provides a visual summary of prediction outcomes by comparing predicted classes against actual observations.
Structure:
| | Predicted Negative | Predicted Positive |
| --- | --- | --- |
| Actual Negative | True Negative (TN) | False Positive (FP) |
| Actual Positive | False Negative (FN) | True Positive (TP) |
Interpretation:
True Positives (TP): Correctly predicted positive cases.
True Negatives (TN): Correctly predicted negative cases.
False Positives (FP): Incorrectly predicted positives (Type I error).
False Negatives (FN): Missed positives (Type II error).
The matrix helps visualize error trade-offs as the cutoff varies, enabling users to select an optimal threshold balancing business costs and risks.
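A sketch of deriving the four cells at two different cutoffs (stand-in data; scikit-learn orders the binary matrix as [[TN, FP], [FN, TP]]):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
proba = np.clip(y_true * 0.4 + rng.random(500) * 0.6, 0, 1)

# Cells at the default cutoff of 0.5.
tn, fp, fn, tp = confusion_matrix(y_true, (proba >= 0.5).astype(int)).ravel()
print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")

# A stricter cutoff trades false positives for false negatives.
tn2, fp2, fn2, tp2 = confusion_matrix(y_true, (proba >= 0.7).astype(int)).ravel()
print(f"TN={tn2}  FP={fp2}  FN={fn2}  TP={tp2}")
```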

4. Precision Plot
Purpose: The Precision Plot assesses the calibration of predicted probabilities by showing how well they align with observed positive outcomes.
Functionality:
Observations are grouped (binned) by similar predicted probabilities.
For each bin, the percentage of actual positives is plotted against the average predicted probability.
A perfectly calibrated model follows a diagonal line from (0,0) to (1,1).
Well-performing models display clear separation, with most observations near 0% or 100%.
Use Cases:
Evaluate probability calibration for risk scoring models.
Identify probability ranges where model confidence may be overstated or understated.
Support post-model calibration methods such as Platt scaling or isotonic regression.
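A minimal calibration sketch using scikit-learn's calibration_curve on stand-in data:

```python
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
proba = np.clip(y_true * 0.4 + rng.random(1000) * 0.6, 0, 1)

# Bin predictions, then compare the observed positive rate in each bin to
# the bin's mean predicted probability; a calibrated model tracks the diagonal.
frac_positive, mean_predicted = calibration_curve(y_true, proba, n_bins=10)
for mp, fp_rate in zip(mean_predicted, frac_positive):
    print(f"mean predicted {mp:.2f} -> observed positive rate {fp_rate:.2f}")
```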
5. Classification Plot
Purpose: The Classification Plot displays the fraction of each predicted class above and below the chosen cutoff, providing a visual summary of the model’s classification distribution.
Interpretation: This helps users quickly gauge the class balance, identify any bias toward one class, and visualize how the selected threshold affects predicted class proportions.
Use Cases:
Validate model output distribution across thresholds.
Detect class imbalance or prediction skew.
Optimize cutoff settings to achieve desired positive-class proportions.
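The underlying computation is a simple per-class fraction; a sketch with stand-in data:

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
proba = np.clip(y_true * 0.4 + rng.random(1000) * 0.6, 0, 1)

# Fraction of each actual class falling above the chosen cutoff.
cutoff = 0.5
above = proba >= cutoff
for label in (0, 1):
    frac = above[y_true == label].mean()
    print(f"actual class {label}: {frac:.1%} of records predicted positive")
```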

6. ROC AUC Plot
Purpose: The Receiver Operating Characteristic (ROC) Curve illustrates the model’s capability to distinguish between classes at various thresholds.
Functionality:
X-axis: False Positive Rate (FPR) = FP / (FP + TN).
Y-axis: True Positive Rate (TPR) = TP / (TP + FN).
The AUC (Area Under Curve) quantifies the overall discriminative power—values closer to 1.0 indicate superior performance.
Use Cases:
Compare competing models’ discriminative abilities.
Select thresholds that maximize TPR while minimizing FPR.
Provide standardized visual evidence for model evaluation reports.
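A sketch of the ROC computation, including one common threshold heuristic (Youden's J), on stand-in data:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
proba = np.clip(y_true * 0.4 + rng.random(1000) * 0.6, 0, 1)

fpr, tpr, thresholds = roc_curve(y_true, proba)
print("AUC:", roc_auc_score(y_true, proba))

# Youden's J statistic: the threshold that maximizes TPR - FPR.
best = np.argmax(tpr - fpr)
print(f"threshold={thresholds[best]:.2f}  TPR={tpr[best]:.2f}  FPR={fpr[best]:.2f}")
```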
7. PR AUC Plot (Precision-Recall Curve)
Purpose: The Precision-Recall (PR) Curve shows the trade-off between precision and recall at different thresholds, especially informative for imbalanced datasets.
Interpretation: A model with high area under the PR curve (AUC-PR) maintains strong precision and recall across thresholds, indicating robust performance on minority classes.
Use Cases:
Evaluate classifier performance on imbalanced data (e.g., fraud, rare diseases).
Optimize decision thresholds for minority event detection.
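A sketch on deliberately imbalanced stand-in data (about 10% positives), where the PR view is most informative:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.1).astype(int)   # ~10% positives
proba = np.clip(y_true * 0.4 + rng.random(1000) * 0.6, 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, proba)
# Average precision is a standard summary of the area under the PR curve.
print("Average precision:", average_precision_score(y_true, proba))
```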

8. Lift Curve
Purpose: The Lift Curve compares the effectiveness of the model’s predictions against random selection.
Functionality:
The curve plots the percentage of positive observations captured when selecting the top X% of records ranked by prediction score.
Lift is calculated as:
Lift = Positive rate with model / Positive rate with random selection
A lift greater than 1 indicates the model performs better than random guessing.
Use Cases:
Quantify the improvement over baseline for marketing, credit scoring, or risk models.
Evaluate return on targeting strategies using model-driven segmentation.
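A direct translation of this formula into a small helper (stand-in data):

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.2).astype(int)   # ~20% positives
proba = np.clip(y_true * 0.4 + rng.random(1000) * 0.6, 0, 1)

def lift_at(y, scores, top_frac):
    """Positive rate in the top X% of scores, divided by the overall rate."""
    n_top = max(1, int(len(scores) * top_frac))
    top_idx = np.argsort(scores)[::-1][:n_top]
    return y[top_idx].mean() / y.mean()

for frac in (0.05, 0.10, 0.25):
    print(f"lift at top {frac:.0%}: {lift_at(y_true, proba, frac):.2f}")
```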
9. Cumulative Precision Plot
Purpose: The Cumulative Precision Plot shows the percentage of positive labels expected when sampling the top X% of records with the highest predicted scores.
Interpretation: This cumulative view helps decision-makers determine how many top-ranked records should be targeted to achieve a desired precision level.
Use Cases:
Identify optimal targeting segments (e.g., top 10% of customers likely to convert).
Support operational planning for limited resource allocation (e.g., audit prioritization, campaign selection).
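A sketch of the cumulative computation (stand-in data): sort by descending score, then take a running mean of the labels:

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.2).astype(int)
proba = np.clip(y_true * 0.4 + rng.random(1000) * 0.6, 0, 1)

# Running positive rate when targeting the k highest-scoring records.
order = np.argsort(proba)[::-1]
cum_precision = np.cumsum(y_true[order]) / np.arange(1, len(y_true) + 1)
for frac in (0.05, 0.10, 0.25):
    k = int(len(y_true) * frac)
    print(f"top {frac:.0%}: expected precision {cum_precision[k - 1]:.2f}")
```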

Summary
| Component | Function | Purpose | Use Case |
| --- | --- | --- | --- |
| Global Cutoff | Define probability threshold | Optimize classification decision boundary | Control model sensitivity/specificity |
| Model Performance Metrics | Summarize core KPIs | Quantify accuracy, precision, recall, F1 | Benchmark and monitor model quality |
| Confusion Matrix | Visualize classification outcomes | Evaluate TP/FP/FN/TN trade-offs | Select optimal threshold |
| Precision Plot | Check calibration | Validate probability accuracy | Model reliability analysis |
| Classification Plot | Show class fractions vs. cutoff | Identify bias and distribution | Threshold tuning |
| ROC AUC Plot | Compare TPR vs. FPR | Evaluate separability | Model discrimination assessment |
| PR AUC Plot | Show precision-recall trade-off | Evaluate minority-class performance | Imbalanced dataset analysis |
| Lift Curve | Compare to random selection | Quantify model's predictive gain | Marketing and risk scoring |
| Cumulative Precision | Show top X% sampling precision | Optimize targeting or resource allocation | Prioritization strategies |
Individual Predictions Tab
The Individual Predictions tab provides local explanations for single prediction instances, allowing users to trace specific outcomes back to their feature-level contributions. It offers a detailed breakdown of how a prediction is formed for a specific observation, showing how input features interact within the model to generate the predicted probability for each target label. These insights enhance transparency, interpretability, and trust in AI-driven decision-making.
Select Index
The user can select a record directly from the dropdown or click the Random Index button to pick a random record that satisfies the chosen constraints. For example, the user can request a record whose observed target value is negative but whose predicted probability of being positive is very high. In this way the user can sample only false positives or only false negatives.
Prediction
Purpose: Displays the predicted probability associated with each target label for the selected observation.
Functionality:
Shows the model’s confidence level in assigning an observation to a specific class or category.
Allows comparison between multiple class probabilities to identify the most likely predicted outcome.
Helps users validate if the model’s probability distribution aligns with domain expectations.
Use Cases:
Interpret classification model output beyond binary “positive” or “negative” labels.
Assess model confidence in multi-class scenarios (e.g., customer churn risk, product recommendation likelihood).
Support probability calibration and threshold tuning during model evaluation.
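In scikit-learn terms, this corresponds to a single predict_proba call for the selected record (a sketch with a synthetic model):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

idx = 42  # e.g. the record chosen via the Select Index control
proba = model.predict_proba(X[idx:idx + 1])[0]
for label, p in zip(model.classes_, proba):
    print(f"label {label}: probability {p:.3f}")
```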

Contributions Plot
Purpose: The Contributions Plot provides a visual explanation of how each feature contributes to the prediction for an individual observation.
Functionality:
Starts from the population average prediction (baseline value).
Adds or subtracts feature-level contributions until reaching the final predicted probability for the selected record.
Positive contributions indicate features that increase the prediction value, while negative contributions indicate features that decrease it.
Interpretation: This visualization helps users understand precisely how the model builds each prediction, highlighting which variables were most influential for a particular instance.
Use Cases:
Provide transparency into model decision-making at the individual level.
Debug unexpected predictions by inspecting the contribution of key features.
Identify outliers or anomalous prediction behavior.
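A minimal, model-agnostic sketch of this decomposition with the shap package (the dashboard's own implementation may differ, and the array shapes shown here depend on the shap version):

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Model-agnostic explainer over the probability output.
explainer = shap.Explainer(model.predict_proba, X)
exp = explainer(X[42:43])

contribs = exp.values[0, :, 1]     # signed contributions toward the positive class
baseline = exp.base_values[0, 1]   # population-average positive probability
print("baseline + contributions =", baseline + contribs.sum())
print("model predict_proba      =", model.predict_proba(X[42:43])[0, 1])
```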
Partial Dependence Plot (PDP)
Purpose: The Partial Dependence Plot (PDP) illustrates how the model’s prediction changes as the value of one feature varies, holding other features constant.
Functionality:
X-axis: Represents the selected feature.
Y-axis: Represents the model’s predicted outcome or probability.
Grey Line: Shows the average effect of changing the feature across sampled observations.
Blue Line(s): Represent individual record trajectories, depicting how predictions evolve for specific observations.
User Controls: Allow adjustments for:
Number of observations sampled for averaging.
Number of gridlines (individual observations) shown.
Number of grid points (intervals) used to compute predictions along the x-axis.
Interpretation: The PDP provides insights into feature sensitivity and non-linear relationships—showing how model output reacts to variations in a single input variable.
Use Cases:
Detect critical thresholds or breakpoints (e.g., income, age, credit score).
Analyze monotonic or non-monotonic feature effects.
Guide feature engineering and inform business rule definitions.
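scikit-learn can reproduce this style of plot directly; in the sketch below, kind="both" overlays the average line with individual (ICE) curves, and grid_resolution / subsample mirror the grid-point and observation controls described above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Average partial dependence plus per-record ICE curves for feature 0.
PartialDependenceDisplay.from_estimator(
    model, X, features=[0], kind="both", grid_resolution=20, subsample=50,
)
```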

Contributions Table
Purpose: The Contributions Table complements the visual Contributions Plot by providing a tabular summary of how each feature contributes numerically to the prediction for a specific observation.
Functionality:
Lists all input features with their corresponding contribution value and direction of influence.
The cumulative sum of all contributions (starting from the population average) equals the final predicted probability.
Allows precise quantification of feature effects for validation and reporting.
Interpretation: This tabular representation provides granular transparency into the prediction process, helping analysts and auditors trace the decision pathway behind any individual prediction.
Use Cases:
Support model explainability in audit and compliance scenarios.
Facilitate communication between data scientists and business users.
Enable comparison of feature contributions across multiple records.
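A sketch of building such a table with pandas and shap (illustrative names; the dashboard's table may differ):

```python
import pandas as pd
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
feature_names = [f"feature_{i}" for i in range(6)]
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.Explainer(model.predict_proba, X)
exp = explainer(X[42:43])

# One row per feature: its value and its signed contribution, sorted by impact.
table = pd.DataFrame(
    {"value": X[42], "contribution": exp.values[0, :, 1]},
    index=feature_names,
).sort_values("contribution", key=abs, ascending=False)
print(table)
```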

Summary
| Component | Function | Purpose | Use Case |
| --- | --- | --- | --- |
| Predicted Probability | Displays confidence per target label | Understand likelihood distribution | Validate prediction confidence |
| Contributions Plot | Visual breakdown of feature effects | See how individual features build the prediction | Explain model decision paths |
| Partial Dependence Plot | Shows sensitivity to one feature | Identify thresholds and non-linear relationships | Guide feature engineering |
| Contributions Table | Numeric breakdown of contributions | Quantify and audit prediction influence | Governance and validation |
What-if Analysis Tab
Purpose: The What-if Analysis tab enables users to test hypothetical input scenarios and observe how real-time changes in features influence the model’s output.
Key Components:
Manual Input Modification: Allows direct adjustment of feature values (e.g., income, age, transaction amount) for the selected record.
Real-time Prediction Updates: The dashboard dynamically recalculates predictions with modified inputs.
Sensitivity Testing: Supports experimentation to evaluate the stability of the model under changing conditions.
Use Cases:
Evaluate fairness by simulating demographic or socioeconomic variations.
Test operational thresholds (e.g., determine at what credit score a loan is approved).
Assess the robustness and resilience of predictions to small perturbations in feature values.
Feature Input
The user can adjust the input values to see predictions for what-if scenarios.
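Conceptually, the tab copies the selected record, overwrites one or more features, and re-scores it; a minimal sketch:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

row = X[42].copy()
print("original prediction:", model.predict_proba(row.reshape(1, -1))[0, 1])

# Perturb one feature and re-score, as the What-if tab does interactively.
row[3] += 1.0
print("modified prediction:", model.predict_proba(row.reshape(1, -1))[0, 1])
```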

Feature Dependence Tab
Purpose: The Feature Dependence tab visualizes the relationship between an individual feature and the model's predicted outcome, using SHAP summary and SHAP dependence plots.
SHAP Summary
The SHAP Summary plot summarizes the SHAP values per feature. The user can either select an aggregate display showing the mean absolute SHAP value per feature, or a more detailed view of the spread of SHAP values per feature and how they correlate with the feature value (red is high).
SHAP Dependence
This plot displays the relationship between feature values and SHAP values, allowing users to investigate how a feature's value affects its impact on the prediction. Users can check whether the model uses features in line with their intuition, or use the plots to learn what relationships the model has learned between the input features and the predicted outcome.
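A sketch of an equivalent dependence plot with the shap package (using its legacy dependence_plot API on positive-class SHAP values; all setup here is illustrative):

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.Explainer(model.predict_proba, X)
exp = explainer(X)
sv_positive = exp.values[:, :, 1]   # SHAP values toward the positive class

# Feature 3's value vs. its SHAP value; shap auto-selects a second feature
# for coloring to hint at interaction effects.
shap.dependence_plot(3, sv_positive, X)
```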

Use Cases:
Identify key inflection points (e.g., optimal credit score or income level).
Understand monotonic or non-monotonic feature effects on prediction outcomes.
Inform feature engineering strategies and guide business rule formulation.
Summary
Each tab in the Model Explainer serves a specific interpretability objective:
| Tab | Focus | Objective |
| --- | --- | --- |
| Feature Importance | Global influence across all predictions | Understand dominant drivers |
| Individual Predictions | Local explanation for a single instance | Explain individual outcomes |
| What-if Analysis | Scenario-based input testing | Evaluate model sensitivity and fairness |
| Feature Dependence | Relationship visualization | Reveal trends and thresholds for business decisions |