SHAP DeepExplainer with TensorFlow 2.4+ error

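Problem description

I'm trying to compute SHAP values using DeepExplainer, but I get the following error:

keras is no longer supported, please use tf.keras instead

Even though I'm using tf.keras?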

KeyError       Traceback (most recent call last)
 in
6 # ...or pass tensors directly
7 explainer = shap.DeepExplainer((model.layers[0].input, model.layers[-1].output), background)
8 shap_values = explainer.shap_values(X_test[1:5])

C:\ProgramData\Anaconda3\lib\site-packages\shap\explainers\_deep\__init__.py in shap_values(self, X, ranked_outputs, output_rank_order, check_additivity)
122   were chosen as "top".
124   return self.explainer.shap_values(X, ranked_outputs, output_rank_order, check_additivity=check_additivity)
C:\ProgramData\Anaconda3\lib\site-packages\shap\explainers\_deep\deep_tf.py in shap_values(self, X, ranked_outputs, output_rank_order, check_additivity)
310                 # assign the attributions to the right part of the output arrays
311                 for l in range(len(X)):
312                     phis[l][j] = (sample_phis[l][bg_data[l].shape[0]:] * (X[l][j] - bg_data[l])).mean(0)
313
314             output_phis.append(phis[0] if not self.multi_input else phis)

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)

    2798 if self.columns.nlevels > 1:
    2799    return self._getitem_multilevel(key)
    2800    indexer = self.columns.get_loc(key)
    2801 if is_integer(indexer):
    2802    indexer = [indexer]
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2646                 return self._engine.get_loc(key)
2647             except KeyError:
2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2650         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

import shap
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow.keras.backend as K

from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from tensorflow.python.keras.layers import Dense
from tensorflow.python.keras import Sequential
from tensorflow.keras import optimizers

# print the JS visualization code to the notebook
shap.initjs()

X_train,X_test,Y_train,Y_test = train_test_split(*shap.datasets.iris(), test_size=0.2, random_state=0)

Y_train = to_categorical(Y_train, num_classes=3)
Y_test = to_categorical(Y_test, num_classes=3)

# Define baseline model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(8, input_dim=len(X_train.columns), activation="relu"))
model.add(tf.keras.layers.Dense(3, activation="softmax"))
model.summary()


# compile the model
model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=['accuracy'])

hist = model.fit(X_train, Y_train, batch_size=5,epochs=200, verbose=0)

# select a set of background examples to take an expectation over
background = X_train.iloc[np.random.choice(X_train.shape[0], 100, replace=False)]

# Explain predictions of the model
#explainer = shap.DeepExplainer(model, background)
# ...or pass tensors directly
explainer = shap.DeepExplainer((model.layers[0].input, model.layers[-1].output), background)
shap_values = explainer.shap_values(X_test[1:5])

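Solution

TL;DR

- Add tf.compat.v1.disable_v2_behavior() at the top of the script for TF 2.4+.
- Compute SHAP values on a NumPy array, not on a DataFrame.

Full reproducible example: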

import shap
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

import tensorflow as tf
tf.compat.v1.disable_v2_behavior()  # <-- HERE: call this before building the model

from tensorflow.keras.utils import to_categorical

print("SHAP version is:", shap.__version__)
print("Tensorflow version is:", tf.__version__)

X_train, X_test, Y_train, Y_test = train_test_split(
    *shap.datasets.iris(), test_size=0.2, random_state=0
)

Y_train = to_categorical(Y_train, num_classes=3)
Y_test = to_categorical(Y_test, num_classes=3)

# Define baseline model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(8, input_dim=len(X_train.columns), activation="relu"))
model.add(tf.keras.layers.Dense(3, activation="softmax"))
# model.summary()

# compile the model
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

hist = model.fit(X_train, Y_train, batch_size=5, epochs=200, verbose=0)

# select a set of background examples to take an expectation over
background = X_train.iloc[np.random.choice(X_train.shape[0], 100, replace=False)]

explainer = shap.DeepExplainer(
    (model.layers[0].input, model.layers[-1].output), background
)
shap_values = explainer.shap_values(X_test[:3].values)  # <-- HERE: pass a NumPy array, not a DataFrame

# print the JS visualization code to the notebook
shap.initjs()
shap.force_plot(
    explainer.expected_value[0], shap_values[0][0], feature_names=X_train.columns
)

SHAP version is: 0.39.0
Tensorflow version is: 2.5.0
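
Why the .values call matters: the traceback ends inside pandas' __getitem__ with KeyError: 0, which suggests that the integer indexing in shap_values (X[l][j]) was applied to a DataFrame, where an integer is treated as a column label rather than a row position. The snippet below is a minimal sketch of that behavior using only pandas and NumPy. The toy data and the helper names first_row and as_array are made up for illustration and are not part of SHAP.

```python
import numpy as np
import pandas as pd

# Toy stand-in for X_test: a DataFrame with string column labels (made-up data).
df = pd.DataFrame({"sepal_len": [5.1, 4.9, 6.2], "sepal_wid": [3.5, 3.0, 2.9]})


def first_row(X):
    """Mimic the plain integer indexing X[0] that array-oriented code performs."""
    return X[0]


# On a DataFrame, X[0] is a column lookup by label, so it raises KeyError.
try:
    first_row(df)
    dataframe_error = None
except KeyError as exc:
    dataframe_error = exc

# On the underlying NumPy array, X[0] is simply the first row.
row = first_row(df.values)


def as_array(X):
    """Hypothetical helper: turn DataFrames into arrays, pass other inputs through."""
    return X.to_numpy() if isinstance(X, pd.DataFrame) else np.asarray(X)


print(repr(dataframe_error))
print(row)
```

Under this assumption, wrapping inputs with something like as_array(X_test[:3]) before calling explainer.shap_values would have the same effect as the .values call in the example above.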
