Outline
What is Explainable AI?
Desiderata of an Explainable AI technique
Uses of Explainable AI
Methods for Explainable AI
Activation Maximization
Shapley Values
Taylor Expansions
Layer-wise Relevance Propagation
1/24
What is Explainable AI?
Standard machine learning:
The function f is typically considered to be a black-box whose parameters are learned from the data using e.g. gradient descent. The objective to minimize encourages the predictions f(x) to coincide with the ground truth on the training and test data.
[Diagram: input features x1, …, xd feed into the black-box model, producing the output f(x)]
Machine learning + Explainable AI:
We look not only at the outcome f(x) of the prediction but also at the way the prediction is produced by the ML model, e.g. which features are used, how these features are combined, or to which input pattern the model responds the most.
2/24
What is Explainable AI?
Example 1: Synthesize an input pattern that most strongly activates the output of the ML model associated with a particular class.
Image source: Nguyen et al. (2016) Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks
3/24
What is Explainable AI?
Example 2: Highlight the features that, for a given data point, have contributed to the ML prediction.
Image source: Lapuschkin et al. (2016) Analyzing Classifiers: Fisher Vectors and Deep Neural Networks
[Figure: image/heatmap pairs for predictions of the classes bike, person, cat, train, and dining table]
4/24
What is Explainable AI?
Example 3: Concept activation vectors (TCAV). Highlight the mid-level concepts that explain, for a given data point, the ML prediction.
Source: Google Keynote '19 (URL: https://www.youtube.com/watch?v=lyRPyRKHO8M&t=2279s)
5/24
Desiderata of an Explanation
In practice, we would like the explanation technique to satisfy a number of properties:
1. Fidelity: The explanation should reflect the quantity being explained and not something else.
2. Understandability: The explanation must be easily understandable by its receiver.
3. Sufficiency: The explanation should provide sufficient information on how the model came up with its prediction.
4. Low Overhead: The explanation should not cause the prediction model to become less accurate or less efficient.
5. Runtime Efficiency: Explanations should be computable in reasonable time.
see also Swartout & Moore (1993), Explanation in Second Generation Expert Systems.
[Figure: example image and heatmap for the class train]
6/24
Uses of an Explanation
Verify (and improve?) an ML model
Verify that the model is based on features which generalize well to examples outside the current data distribution (this cannot be done with standard validation techniques!).
Reliance of ML models on the wrong features is often encountered when there are spurious correlations in the data.
From the explanation, the model's trustworthiness can be reevaluated, and a flawed ML model can potentially be retrained based on the user's feedback.
7/24
Uses of an Explanation
Example: The classifier is right for the wrong reasons
[Figure: image and heatmap for the class horse]
Average precision of the Fisher Vector model on the Pascal VOC dataset, per class:

aer 79.08    bus 69.67    din 58.06    pot 28.62
bic 66.44    car 80.96    dog 42.28    she 49.58
bir 45.90    cat 59.92    hor 80.45    sof 49.31
boa 70.88    cha 51.92    mot 69.34    tra 82.71
bot 27.64    cow 47.60    per 85.10    tvm 54.33
In this example, the classifier accurately predicts the horse class, but it does so based on the wrong features (a copyright tag in the corner of the image).
This incorrect decision strategy cannot be detected by just looking at the test error.
cf. Lapuschkin et al. (2019) Unmasking Clever Hans Predictors and Assessing What Machines Really Learn. Nature Communications
8/24
Uses of an Explanation
Learn something about the data (or about the system that produced the data)
Step 1: Train an ML model that predicts the data well.
Step 2: Apply XAI to the trained ML model to produce explanations of the ML decision strategy.
Step 3: Based on the XAI explanations, the user can compare their reasoning with that of the ML model and potentially refine their own domain knowledge.
Image source: Thomas et al. (2019) Analyzing Neuroimaging Data Through Recurrent Deep Learning Models
9/24
Part II: Methods of XAI
Presented methods
Activation maximization
Shapley values
Taylor expansion
Layer-wise relevance propagation
Other methods
Surrogate models (LIME)
Integrated gradients / expected gradients / SmoothGrad
Influence functions
10/24
Activation Maximization
Assume a trained ML model (e.g. a neural network), and suppose we would like to understand what concept is associated with some particular output neuron of the ML model, e.g. the output neuron that codes for the class 'cat'. Activation maximization proceeds in two steps:
Step 1: Think of the ML model as a function of the input (with the learned parameters θ held fixed).
Step 2: Explain the function f by generating a maximally activating input pattern:

$$x^\star = \arg\max_x f(x, \theta)$$
11/24
Activation Maximization
Problem: In most cases f(x) does not have a single point corresponding to the maximum.
E.g. in linear models, f(x) = w⊤x + b, we can keep moving the point x further along the direction w, and the output continues to grow.
Therefore, we would like to apply a preference for regular regions of the input domain, i.e.
$$x^\star = \arg\max_x \; \big[\, f(x) + \Omega(x) \,\big]$$
In practice, the preference can be for data points with small norm, i.e. we set $\Omega(x) = -\lambda \|x\|^2$ so that points with large norm are penalized.
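A minimal sketch of this procedure (gradient ascent on the input with an l2 preference term), written here in PyTorch; `model`, `class_idx`, the input shape and the hyperparameters are placeholders and not part of the slides:

```python
import torch

def activation_maximization(model, class_idx, input_shape=(1, 3, 224, 224),
                            lam=0.01, lr=0.1, steps=200):
    """Gradient ascent on the input to maximize f(x) + Omega(x), with Omega(x) = -lam * ||x||^2."""
    model.eval()
    x = torch.zeros(input_shape, requires_grad=True)    # start from a neutral input
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        score = model(x)[0, class_idx]                  # output of the neuron to maximize
        objective = score - lam * (x ** 2).sum()        # f(x) + Omega(x)
        (-objective).backward()                         # ascent = descent on the negative
        optimizer.step()
    return x.detach()
```

More elaborate preference terms (such as the density-based regularizer discussed on the "Probability View" slide) typically produce more interpretable patterns.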
12/24
Activation Maximization: Examples
[Figure: two examples, f(x) = w⊤x + b with Ω(x) = −λ‖x‖², and f(x) = max(x1, x2) with Ω(x) = −λ‖x‖²]
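For the linear example, the regularized maximizer can be computed in closed form (a short derivation assuming the penalty Ω(x) = −λ‖x‖² with λ > 0, as above):

$$\nabla_x \big( w^\top x + b - \lambda \|x\|^2 \big) = w - 2\lambda x = 0 \quad \Longrightarrow \quad x^\star = \frac{w}{2\lambda},$$

i.e. the generated pattern points in the direction of the weight vector w. For the second example, the same calculation carried out on each of the regions x1 ≥ x2 and x2 ≥ x1 yields two maximizers, (1/(2λ), 0) and (0, 1/(2λ)), showing that the maximally activating pattern need not be unique.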
13/24
Activation Maximization: Probability View
Assume the model produces a log-probability for class c:

$$f(x) = \log p(c \mid x)$$

The input x that maximizes this function can be interpreted as the point where the classifier is the most sure about class c.
Choose the regularizer $\Omega(x) = \log p(x)$, i.e. favor points that are likely.
The optimization problem becomes:

$$x^\star = \arg\max_x \; \big[ \log p(c \mid x) + \log p(x) \big] = \arg\max_x \; \log p(x \mid c)$$

where the second equality follows from Bayes' rule ($\log p(c \mid x) + \log p(x) = \log p(x \mid c) + \log p(c)$, and $\log p(c)$ does not depend on x), and where $x^\star$ can now be interpreted as the most typical input for class c.
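As a concrete instance (an assumption for illustration, not on the slide): if the data density is modeled as an isotropic Gaussian $p(x) = \mathcal{N}(\mu, \sigma^2 I)$, the regularizer becomes, up to an additive constant,

$$\Omega(x) = \log p(x) = -\frac{1}{2\sigma^2} \|x - \mu\|^2 + \text{const},$$

which recovers an l2-type preference as on the previous slides, now centered on the data mean μ rather than on the origin.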
14/24
Attribution of a Prediction to Input Features
[Diagram: input → ML black-box → prediction, with the attribution mapping the prediction back onto the input features]
1. The data x ∈ R^d is fed to the ML black-box and we get a prediction f(x) ∈ R.
2. We explain the prediction by determining the contribution of each input feature.
Key property of an explanation: conservation ($\sum_{i=1}^{d} \phi_i = f(x)$).
15/24
Attribution: Shapley Values
Framework originally proposed in the context of game theory (Shapley 1951) for assigning payoffs in a cooperative game, and recently applied to ML models.
Each input variable is viewed as a player, and the function output as the profit realized by the cooperating players.
The Shapley values $\phi_1, \dots, \phi_d$ measuring the contribution of each feature are:

$$\phi_i = \sum_{S \,:\, i \notin S} \frac{|S|! \, (d - |S| - 1)!}{d!} \, \big( f(x_{S \cup \{i\}}) - f(x_S) \big)$$

where $(x_S)_S$ are all possible subsets of features contained in the input x.
16/24
Attribution: Shapley Values
Recall:

$$\phi_i = \sum_{S \,:\, i \notin S} \frac{|S|! \, (d - |S| - 1)!}{d!} \, \big( f(x_{S \cup \{i\}}) - f(x_S) \big)$$
Worked-through example: Consider the function f(x) = x1 · (x2 + x3). Calculate the contribution of each feature to the prediction at x = (1, 1, 1), i.e. f(x) = 1 · (1 + 1) = 2.
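A brute-force Python sketch of this computation, which enumerates all subsets explicitly; it assumes (one common convention, left implicit on the slide) that the features removed in $f(x_S)$ are set to zero:

```python
import math
from itertools import combinations

import numpy as np

def shapley_values(f, x):
    """Exact Shapley values by enumerating all subsets S with i not in S."""
    d = len(x)

    def f_subset(S):
        masked = np.zeros(d)              # assumption: absent features are set to 0
        masked[list(S)] = x[list(S)]
        return f(masked)

    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            for S in combinations(others, size):
                weight = (math.factorial(len(S)) * math.factorial(d - len(S) - 1)
                          / math.factorial(d))
                phi[i] += weight * (f_subset(S + (i,)) - f_subset(S))
    return phi

# Worked-through example from the slide: f(x) = x1 * (x2 + x3) at x = (1, 1, 1)
f = lambda x: x[0] * (x[1] + x[2])
print(shapley_values(f, np.array([1.0, 1.0, 1.0])))   # -> [1.  0.5 0.5]
```

The resulting contributions (1, 0.5, 0.5) sum to 2, i.e. to the prediction f(x) minus the value f(0) = 0 of the all-zeros baseline.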
17/24
Attribution: Taylor Expansions
Many ML models f (x) are complex and nonlinear when taken globally but are simple and linear when taken locally.
The function can be approximated locally by some Taylor expansion:

$$f(x) = f(\tilde{x}) + \sum_{i=1}^{d} [\nabla f(\tilde{x})]_i \, (x_i - \tilde{x}_i) + \dots$$

The first-order terms $\phi_i = [\nabla f(\tilde{x})]_i \, (x_i - \tilde{x}_i)$ of the expansion can serve as an explanation.
The explanation $(\phi_i)_i$ depends on the choice of root point $\tilde{x}$.
18/24
Attribution: Taylor Expansions
Example: Attribute the prediction f(x) = w⊤x with x ∈ R^d onto the d input features.
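One possible solution sketch, choosing the root point $\tilde{x} = 0$ (assuming the zero input is an admissible root point):

$$f(x) = f(0) + \sum_{i=1}^{d} w_i (x_i - 0) = \sum_{i=1}^{d} w_i x_i, \qquad \phi_i = w_i x_i,$$

i.e. the familiar "weight × input" attribution; since f(0) = 0 and all higher-order terms vanish, the conservation property $\sum_i \phi_i = f(x)$ holds exactly.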
19/24
Attribution: Taylor Expansions
Limitations: Gradient information is too localized.
Cannot handle saturation effects and discontinuities, e.g. cannot explain the function

$$f(x) = \sum_{i=1}^{d} \big( x_i - \max(0, x_i) \big)$$

at the point x = (2, 2).
This limitation can be overcome by looking at the structure of the model and decomposing the problem of explanation into multiple parts (→ next slide).
20/24
Attribution: Look at the Structure of the Model

Observation:
The function implemented by an ML model is typically a composition of simple elementary functions.
These functions are simpler to analyze than the whole input-output function.

Idea:
Treat the problem of explanation as propagating the prediction backward in the input-output graph.
The layer-wise relevance propagation (LRP) method implements this approach and can be used to explain ML models (→ next slide).

[Figure: small example network with inputs x1, x2 connected by weights w_ij to hidden neurons a3, …, a6, which are connected by weights v_j to the output y_out]
21/24
Attribution: The LRP Method
Example: Consider y_out to be the quantity to explain, and set $R_{\text{out}} = y_{\text{out}}$.

[Figure: the same example network with inputs x1, x2, hidden neurons a3, …, a6, and output y_out]

Step 1: Propagate on the hidden layer:

$$\forall\, j \in \{3, \dots, 6\}: \quad R_j = \frac{a_j v_j}{\sum_{j=3}^{6} a_j v_j} \, R_{\text{out}}$$

Step 2: Propagate on the first layer:

$$\forall\, i \in \{1, 2\}: \quad R_i = \sum_{j=3}^{6} \frac{x_i w_{ij}}{\sum_{i=1}^{2} x_i w_{ij}} \, R_j$$
Note: Other propagation rules can be engineered, and choosing appropriate propagation rules is important to ensure LRP works well for practical neural network architectures.
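A minimal NumPy sketch of these two propagation steps, assuming a network of the form $a_j = \max(0, \sum_i x_i w_{ij})$ and $y_{\text{out}} = \sum_j a_j v_j$ (the concrete weights of the toy network are not reproduced here); a small stabilizer is added to the denominators, in the spirit of the LRP-ε rule, to guard against division by zero:

```python
import numpy as np

def lrp_two_layer(x, W, v, eps=1e-9):
    """Propagate the output relevance back to the inputs of a two-layer ReLU network."""
    a = np.maximum(0.0, x @ W)                    # hidden activations a_j
    R_out = a @ v                                 # prediction y_out to be explained

    # Step 1: propagate R_out onto the hidden layer
    z = a * v                                     # contributions a_j * v_j
    R_hidden = z / (z.sum() + eps) * R_out

    # Step 2: propagate the hidden relevances onto the input features
    z = x[:, None] * W                            # contributions x_i * w_ij
    R_input = (z / (z.sum(axis=0) + eps) * R_hidden).sum(axis=1)
    return R_input                                # approximately conserves: R_input.sum() ~ y_out
```

Other LRP rules essentially modify how the contributions z are computed and normalized in these ratios.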
22/24
Attribution: The LRP Method
Effect of the LRP rules on the explanation (e.g. the class 'castle' predicted by a VGG-16 neural network).
[Figure: the VGG-16 network (stacks of 3×3 convolutions with 64, 128, 256, and 512 channels, followed by fully-connected layers of size 4096, 4096, and 1000) together with heatmaps obtained by applying different LRP rules (LRP-0 and other LRP variants) at different layers]
23/24
Summary
Explainable AI is an important addition to classical ML models (e.g. for validating an ML model or extracting knowledge from it).
Many XAI methods have been developed, each of them with their own strengths and limitations:
Activation maximization can be used to understand what an ML model has learned, but it is unsuitable for explaining an individual prediction f(x).
Shapley values have strong theoretical foundations, but are computationally infeasible for high-dimensional input data.
Taylor expansions are simple and theoretically founded for simple models, but the expansion does not extrapolate well in complex nonlinear models.
LRP leverages the structure of the ML model to handle nonlinear decision functions, but requires carefully choosing the propagation rules.
24/24