Source: AAAI'20 Conference
Source: https://taiwanenglishnews.com/tesla-on-autopilot-crashes-into-overturned-truck/
Source: W. Samek, A. Binder, MICCAI’18
Source: Google DeepMind beating world champion Lee Sedol at Go
"Article 22 of GDPR empowers individuals with the right to demand an explanation of how an AI system made a decision that affects them" - European Commission
"Provide an assessment of the risks posed by the automated decision system to the privacy or security and the risks that contribute to inaccurate, unfair, biased, or discriminatory decisions impacting consumers" - Algorithmic Accountability Act 2019
Black Box model
Global Surrogate model
You have access to the model's structure
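To make the global surrogate idea above concrete: a minimal sketch, assuming a hypothetical trained black-box classifier black_box and a feature matrix X, that fits an interpretable decision tree to the black box's own predictions and measures how faithfully it mimics them.
from sklearn.tree import DecisionTreeClassifier
# Query the black box for its predictions on the available data
# (black_box and X are placeholders for your model and data)
y_bb = black_box.predict(X)
# Fit a shallow, interpretable surrogate to mimic those predictions
surrogate = DecisionTreeClassifier(max_depth=3)
surrogate.fit(X, y_bb)
# Fidelity: how often the surrogate agrees with the black box
fidelity = surrogate.score(X, y_bb)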
$$ \text{IntegratedGrads}_{i}(x) ::= (x_{i} - x'_{i}) \times \int_{\alpha=0}^{1} \frac{\partial F(x' + \alpha \times (x - x'))}{\partial x_{i}} \, d\alpha $$ $$ \text{IntegratedGrads}^{\text{approx}}_{i}(x) ::= (x_{i} - x'_{i}) \times \sum_{k=1}^{m} \frac{\partial F(x' + \frac{k}{m} \times (x - x'))}{\partial x_{i}} \times \frac{1}{m} $$
Source: Axiomatic Attribution for Deep Networks - M. Sundararajan, A. Taly, Q. Yan
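The discrete sum above translates almost line for line into code. A minimal sketch, assuming a hypothetical model F that maps a batch of inputs to class scores, an input tensor x, and a baseline x_baseline (e.g. a black image):
import torch
def integrated_gradients(F, x, x_baseline, target, m=50):
    # Accumulate gradients of the target class score at m points on the
    # straight-line path from the baseline x' to the input x
    total_grads = torch.zeros_like(x)
    for k in range(1, m + 1):
        point = (x_baseline + (k / m) * (x - x_baseline)).detach().requires_grad_(True)
        score = F(point)[:, target].sum()
        grad = torch.autograd.grad(score, point)[0]
        total_grads += grad
    # Scale the averaged gradients by the difference between input and baseline
    return (x - x_baseline) * total_grads / m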
Author: Andrej Karpathy, generated from ILSVRC 2012 images
from captum.attr import GuidedGradCam
from torchvision import models
# Load pretrained AlexNet
alexnet = models.alexnet(pretrained=True)
alexnet.eval()
# Create object to interpret the model
# To make GradCAM work we pass a reference to the last conv layer
guided_gc = GuidedGradCam(alexnet, alexnet.features[10])
# Preprocessed input images
batch = ...
# Predict output on data
out = alexnet(batch)
# Pick the classes with the highest score
score, index = out.max(1)
# Use the interpreter to calculate attributions for the predicted classes
attributions = guided_gc.attribute(batch, index)
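As a possible follow-up (not shown in the original snippet), Captum's visualization utilities can overlay the attribution on the input; the sketch below assumes a single image from the batch is inspected:
import numpy as np
from captum.attr import visualization as viz
# Convert attribution and input to HxWxC numpy arrays and blend them as a heat map
viz.visualize_image_attr(
    np.transpose(attributions[0].detach().numpy(), (1, 2, 0)),
    np.transpose(batch[0].detach().numpy(), (1, 2, 0)),
    method="blended_heat_map",
    sign="absolute_value",
)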
import torch
import torch.nn.functional as F
from captum.insights import AttributionVisualizer
from captum.insights.features import ImageFeature
from torchvision import models
# Load pretrained AlexNet
alexnet = models.alexnet(pretrained=True)
alexnet.eval()
# Launch visualization inside the notebook
visualizer = AttributionVisualizer(
    models=[alexnet],
    score_func=lambda o: F.softmax(o, 1),
    classes=[...],  # class labels
    features=[
        ImageFeature(
            "Photo",
            baseline_transforms=[...],
            input_transforms=[...],
        )
    ],
    dataset=...,
)
visualizer.render()
from tf_explain.core.grad_cam import GradCAM
# Create the Grad-CAM explainer
explainer = GradCAM()
# Run it on the chosen data, model and class (see the sketch below)
output = explainer.explain(*explainer_args)
# Save the resulting heat map to disk
explainer.save(output, output_dir, output_name)
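For reference, explainer_args above unpacks into the arguments of explain. A hedged sketch of what the call could look like, where img, model, the class index 281 and the layer name are placeholder assumptions:
# validation_data is a tuple of (images, labels); labels may be None
data = ([img], None)
output = explainer.explain(data, model, class_index=281, layer_name="activation_1")
explainer.save(output, ".", "grad_cam.png")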
from tf_explain.callbacks.grad_cam import GradCAMCallback
callbacks = [
    GradCAMCallback(
        validation_data=(x_val, y_val),
        layer_name="activation_1",
        class_index=0,
        output_dir=output_dir,
    )
]
model.fit(x_train, y_train, batch_size=2, epochs=2, callbacks=callbacks)
from lime import lime_image
explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,    # input image
    predict,  # predict function of the interpreted classifier
)
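As a possible follow-up, the explanation object can be turned into an image overlay (here assuming the dog class is the top predicted label):
from skimage.segmentation import mark_boundaries
# Keep only the superpixels that speak most strongly for the top predicted label
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0],
    positive_only=True,
    num_features=5,
    hide_rest=False,
)
overlay = mark_boundaries(temp / 255.0, mask)  # divide by 255 assuming a uint8 image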
Input image (left) and interpretation of the dog prediction (right)
import numpy as np
import shap
model = ...   # classifier model
images = ...  # image data set
# Use the first 100 images as the background distribution, explain 3 test images
background = images[:100]
test_images = images[100:103]
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(test_images)
# Move the channel axis to the end so shap.image_plot can render the images
shap_numpy = [np.swapaxes(np.swapaxes(s, 1, -1), 1, 2) for s in shap_values]
test_numpy = np.swapaxes(np.swapaxes(test_images.numpy(), 1, -1), 1, 2)
shap.image_plot(shap_numpy, -test_numpy)
Contribution of features to each class prediction
Contribution of features to the top 2 predictions
import numpy as np
import torch
from captum.attr import Saliency
from captum.metrics import infidelity
# ImageClassifier stands for any trained image classification model
net = ImageClassifier()
saliency = Saliency(net)
# Compute a saliency map for class 3 for the input image
attribution = saliency.attribute(input, target=3)
# Define a perturbation function for the input
def perturb_fn(inputs):
    noise = torch.tensor(np.random.normal(0, 0.003, inputs.shape)).float()
    return noise, inputs - noise
# Infidelity: how well the attribution predicts the effect of the perturbation
infidelity(net, perturb_fn, input, attribution)
>> 0.2177
"There's no such thing as a stupid question!"
Kemal Erdem, Piotr Mazurek, Piotr Rarus
Presentation available at: https://tugot17.github.io/XAI-Presentation