I am trying to compute SHAP values for a single row to get a local explanation, but I consistently get the error below. I have tried several approaches and still have not been able to fix it.
What I have done so far:
I created the randomized decision tree model -
from sklearn.ensemble import ExtraTreesRegressor
extra_tree = ExtraTreesRegressor(random_state=42)
extra_tree.fit(X_train, y_train)
Then I tried to calculate the SHAP values -
import shap
# create an explainer object
explainer = shap.Explainer(extra_tree)
explainer.expected_value
array([15981.25812347])
# calculate SHAP values for a single row
shap_values = explainer.shap_values(pd.DataFrame(X_train.iloc[9274]).T)
This gives me this error -
Exception: Additivity check failed in TreeExplainer! Please ensure the data matrix you passed to the explainer is the same shape that the model was trained on. If your data shape is correct then please report this on GitHub. Consider retrying with the feature_perturbation='interventional' option. This check failed because for one of the samples the sum of the SHAP values was 25687017588058.968750, while the model output was 106205.580000. If this difference is acceptable you can set check_additivity=False to disable this check.
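To put the two numbers in that message side by side: as I understand it, the check compares the model's prediction for a row with the total of that row's SHAP values (base value included). The sketch below only redoes that arithmetic with the figures from the message. The helper names and the tolerance are hypothetical, chosen for illustration, and are not shap's actual implementation.

```python
# Numbers copied from the error message above.
reported_shap_sum = 25687017588058.968750
model_output = 106205.580000


def additivity_gap(shap_sum, output):
    """Return the absolute and relative gap between the SHAP total and the model output."""
    abs_gap = abs(shap_sum - output)
    rel_gap = abs_gap / max(abs(output), 1e-12)  # guard against division by zero
    return abs_gap, rel_gap


def passes_additivity(shap_sum, output, tol=1e-2):
    """Hypothetical check: tol is an illustrative threshold, not the one shap uses."""
    abs_gap, _ = additivity_gap(shap_sum, output)
    return abs_gap <= tol


abs_gap, rel_gap = additivity_gap(reported_shap_sum, model_output)
print(f"absolute gap: {abs_gap:.2f}")
print(f"relative gap: {rel_gap:.3e}")
print("passes:", passes_additivity(reported_shap_sum, model_output))
```

The gap is many orders of magnitude larger than the prediction itself, so this does not look like a rounding issue that a slightly looser tolerance would hide.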
The training data and the single row I passed have the same number of columns:
X_train.shape
(421570, 164)
(pd.DataFrame(X_train.iloc[9274]).T).shape
(1, 164)
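One thing the shape alone does not show is the dtypes. Building the row with pd.DataFrame(series).T goes through a Series, which holds a single dtype, so mixed column types can end up as object. The sketch below uses a small made-up frame (the column names are hypothetical, not from my data) to compare that with selecting the row using a list index. Whether this has anything to do with the error above is an open question.

```python
import pandas as pd

# Made-up frame with mixed dtypes; the column names are illustrative only.
df = pd.DataFrame(
    {
        "store": [1, 2, 3],               # int64
        "temperature": [42.3, 38.5, 40.1],  # float64
        "is_holiday": [False, True, False],  # bool
    }
)

# Approach used above: take the row as a Series, wrap it, and transpose.
row_transposed = pd.DataFrame(df.iloc[1]).T

# Alternative: a list index keeps the result a DataFrame and preserves dtypes.
row_selected = df.iloc[[1]]

print(row_transposed.shape, row_selected.shape)
print(row_transposed.dtypes)
print(row_selected.dtypes)
```

Both results have the same shape, but only the list-index version keeps the original column dtypes.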
I don't think the shape should cause any problem, but to make sure, I also tried to get the right shape using the reshape method.
shap_values = explainer.shap_values(X_train.iloc[9274].values.reshape(1, -1))
X_train.iloc[9274].values.reshape(1, -1).shape
(1, 164)
That did not solve the problem either. So I thought maybe I also needed to match the number of rows, and I created a small data frame to test it.
train = pd.concat([X_train, y_train], axis="columns")
train_small = train.sample(n=500, random_state=42)
X_train_small = train_small.drop("Weekly_Sales", axis=1).copy()
y_train_small = train_small["Weekly_Sales"].copy()
# train a randomized decision tree model
from sklearn.ensemble import ExtraTreesRegressor
extra_tree_small = ExtraTreesRegressor(random_state=42)
extra_tree_small.fit(X_train_small, y_train_small)
# create an explainer object
explainer = shap.Explainer(extra_tree_small)
shap_values = explainer.shap_values(X_train_small)
# I also tried passing the y values like this
shap_values = explainer.shap_values(X_train_small, y_train_small)
But none of this works.
Someone on GitHub suggested uninstalling shap and reinstalling the latest version from GitHub -
pip install git+https://github.com/slundberg/shap.git
I tried that as well, and it is still not working.
Does anyone know how to solve this problem?