SHAP Plots with treeinterpreter

In this notebook, we’ll demonstrate how to create SHAP-type plots using the treeinterpreter package along with SHAP. SHAP (SHapley Additive exPlanations) plots are a popular method for interpreting machine learning models by showing the contribution of each feature to a specific prediction. treeinterpreter is a tool that breaks down the predictions of tree-based models (e.g., Random Forests) into individual feature contributions. By combining these two tools, we can visualize and interpret the impact of features on model predictions.

Prerequisites

Before running the examples, ensure that you have the following libraries installed:

- shap
- treeinterpreter
- scikit-learn

You can install them using:

pip install shap treeinterpreter scikit-learn
import shap
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.datasets import load_iris
from treeinterpreter import treeinterpreter as ti
from sklearn.ensemble import RandomForestClassifier

Example 1: Iris Dataset

This script loads the Iris dataset, trains a RandomForestClassifier model, and uses TreeInterpreter and SHAP to analyze feature contributions.

Steps:

  1. Load the Iris dataset and train a RandomForestClassifier model.
  2. Use TreeInterpreter to obtain predictions, biases, and feature contributions.
  3. Focus on contributions for one class (class 0 in this example).
  4. Create a SHAP Explanation object.
  5. Generate various SHAP plots including a summary plot, a waterfall plot for the first instance, and a bar plot of the mean absolute SHAP values across all features.
import numpy as np
import shap
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from treeinterpreter import treeinterpreter as ti
import matplotlib.pyplot as plt

# Load dataset and train the model
data = load_iris()
X, y = data.data, data.target
model = RandomForestClassifier()
model.fit(X, y)

# Use treeinterpreter to get the prediction, bias, and contributions
prediction, bias, contributions = ti.predict(model, X)

# contributions.shape is (n_samples, n_features, n_classes)
# We reduce the dimensionality by selecting one class
shap_values = contributions[:, :, 0]  # Choose class 0 for visualization

# Wrap the CFCs in a SHAP Explanation object so SHAP's plotting functions can be used
shap_object = shap.Explanation(
    values=shap_values,
    base_values=bias[:, 0],  # Base values should match the selected class
    data=X,
    feature_names=data.feature_names
)

# Generate SHAP summary plot (beeswarm plot) and modify the x-axis label directly
shap.summary_plot(shap_object.values, shap_object.data, feature_names=shap_object.feature_names, show=False)
plt.gca().set_xlabel("CFC values")  # Modify the x-axis label
plt.show()  # Display the plot with the updated label

Generate a SHAP waterfall plot for the first instance:

shap.waterfall_plot(shap_object[0])

For global importances, take the mean of the absolute CFC values across all instances:

mean_abs_shap_values = np.abs(shap_object.values).mean(axis=0)
shap.bar_plot(mean_abs_shap_values, feature_names=shap_object.feature_names)

Example 2: Wine Dataset

This script demonstrates the process of loading the Wine dataset, training a RandomForestClassifier model, and using TreeInterpreter and SHAP to analyze feature contributions. It includes generating several SHAP plots for visualizing the contributions.

Steps:

  1. Load the Wine dataset and train a RandomForestClassifier model.
  2. Use TreeInterpreter to obtain predictions, biases, and feature contributions.
  3. Focus on contributions for one class (class 0 in this example).
  4. Create a SHAP Explanation object.
  5. Generate SHAP plots including a summary plot, a waterfall plot for the first instance, and a bar plot showing the mean absolute SHAP values across all features.
import numpy as np
import shap
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from treeinterpreter import treeinterpreter as ti
import matplotlib.pyplot as plt

# Load the Wine dataset and train the model
data = load_wine()
X, y = data.data, data.target
model = RandomForestClassifier()
model.fit(X, y)

# Use treeinterpreter to get the prediction, bias, and contributions
prediction, bias, contributions = ti.predict(model, X)

# contributions.shape is (n_samples, n_features, n_classes)
# We reduce the dimensionality by selecting one class
shap_values = contributions[:, :, 0]  # Choose class 0 for visualization

# Wrap the CFCs in a SHAP Explanation object so SHAP's plotting functions can be used
shap_object = shap.Explanation(
    values=shap_values,
    base_values=bias[:, 0],  # Base values should match the selected class
    data=X,
    feature_names=data.feature_names
)

# Generate SHAP summary plot (beeswarm plot) and modify the x-axis label directly
shap.summary_plot(shap_object.values, shap_object.data, feature_names=shap_object.feature_names, show=False)
plt.gca().set_xlabel("CFC values")  # Modify the x-axis label
plt.show()  # Display the plot with the updated label

Generate a SHAP waterfall plot for the first instance:

shap.waterfall_plot(shap_object[0])

For global importances, take the mean of the absolute CFC values across all instances:

mean_abs_shap_values = np.abs(shap_object.values).mean(axis=0)
shap.bar_plot(mean_abs_shap_values, feature_names=shap_object.feature_names)

Approximation of SHAP by CFC

The treeinterpreter package decomposes each prediction of a tree-based model into a bias term and per-feature contribution components, as described in http://blog.datadive.net/interpreting-random-forests/.

For a dataset with \(p\) features, each prediction on the dataset is decomposed as prediction = bias + feature_1_contribution + ... + feature_p_contribution. From here on we refer to these “contributions” as Conditional Feature Contributions (CFCs).
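A quick way to see this decomposition in practice is to check that the treeinterpreter outputs add up to the model’s predicted probabilities. The following is a minimal sketch that reuses the prediction, bias, and contributions arrays from the examples above.

# Sum the per-feature contributions and add the bias; the result should
# reproduce the model's predicted probabilities.
reconstructed = bias + contributions.sum(axis=1)  # sum over the feature axis
print(np.allclose(prediction, reconstructed))     # expected output: True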

CFCs provide local, case-specific explanations for predictions by tracing the decision path and assigning changes in the model’s expected output to individual features along that path. However, CFCs are prone to bias, which increases with distance from the root of the decision tree. An increasingly popular alternative, SHapley Additive exPlanation (SHAP) values, seem to reduce this bias but come with a significantly higher computational cost.
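For reference, exact SHAP values for the same random forest can be computed with shap.TreeExplainer. The sketch below reuses model and X from whichever example was run last; note that, depending on the shap version, shap_values returns either a list with one array per class or a single three-dimensional array.

# Exact TreeSHAP values for the fitted RandomForestClassifier
tree_explainer = shap.TreeExplainer(model)
shap_values_exact = tree_explainer.shap_values(X)
# Select class 0 to match the CFCs used above, handling both return formats
if isinstance(shap_values_exact, list):
    shap_class0 = shap_values_exact[0]
else:
    shap_class0 = shap_values_exact[:, :, 0]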

(Loecher, Lai, and Qi 2022) present an empirical comparison of the explanations produced by both methods, using a dataset of 164 publicly available classification problems. For random forests and boosted trees, they observe remarkably high correlations and similarities between local and global SHAP values and CFC scores, resulting in comparable rankings and interpretations. These findings also extend to global feature importance scores, reinforcing their use as proxies for the predictive power of individual features.
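A rough, single-dataset version of that comparison can be run with the objects computed above: correlate the class-0 CFCs (shap_values) with the class-0 TreeSHAP values (shap_class0 from the previous sketch), both globally per feature and locally per instance. This is only an illustration of the idea, not the benchmark from the paper.

# Global comparison: mean absolute value per feature, CFC vs. exact SHAP
cfc_global = np.abs(shap_values).mean(axis=0)
shap_global = np.abs(shap_class0).mean(axis=0)
print("global correlation:", np.corrcoef(cfc_global, shap_global)[0, 1])

# Local comparison: per-instance correlation, averaged over the dataset
local_corrs = [np.corrcoef(shap_values[i], shap_class0[i])[0, 1] for i in range(X.shape[0])]
print("mean local correlation:", np.nanmean(local_corrs))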

As the authors describe it: “We generated several thousand feature subsets for each dataset, re-trained models for each subset, and then measured the correlation between the model loss and the total importance of the included features. In particular, the feature subsets were independently sampled as follows: sample the subset’s cardinality k ∈ {0, 1, ..., d} uniformly at random, then select k elements from {1, ..., d} uniformly at random. We computed both SHAP/CFC and the retrained models’ loss (either negative log loss or RMSE, respectively) on a test set. This strategy yields one correlation coefficient for SHAP and CFC for each of the data sets described in Sect. 2.1.”
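The subset-sampling step in the quote can be sketched as follows. This is a simplified, hypothetical version for a single dataset: it retrains a model on each sampled feature subset, records its test loss, and correlates the losses with the total global CFC importance of the included features (mean_abs_shap_values from above).

from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
d = X.shape[1]

losses, total_importances = [], []
for _ in range(50):                       # a handful of subsets; the paper uses thousands
    k = rng.integers(0, d + 1)            # sample the cardinality k uniformly at random
    if k == 0:
        continue                          # skip the empty subset in this sketch
    subset = rng.choice(d, size=k, replace=False)
    m = RandomForestClassifier(random_state=0).fit(X_tr[:, subset], y_tr)
    losses.append(log_loss(y_te, m.predict_proba(X_te[:, subset])))
    total_importances.append(mean_abs_shap_values[subset].sum())

print("loss vs. total importance correlation:", np.corrcoef(losses, total_importances)[0, 1])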

The following figure depicts these correlations for random forests and boosted trees as a scatterplot with marginal distributions.


References

Loecher, Markus, Dingyi Lai, and Wu Qi. 2022. “Approximation of SHAP Values for Randomized Tree Ensembles.” In International Cross-Domain Conference for Machine Learning and Knowledge Extraction, 19–30. Springer.