SHAP demystified: understand what Shapley values are and how they work

From game theory to machine learning

Some machine learning (ML) models, once deployed, directly affect the lives of billions of people in the most diverse situations, from quickly triaging medical images in a crowded hospital to influencing how you spend your hard-earned money online. However, even though they are used routinely, many of these models are still seen as black boxes.

The question, then, is: if we do not truly understand what’s going on under the hood for these ML models, can we (or should we) even trust them?

The area of interpretability and explainability already offers us some practical tools that strive to shed light on these black-box models. In a previous post, we introduced LIME (local interpretable model-agnostic explanations), which produces explanations for a particular model’s prediction. As discussed, post hoc explanations (like the ones provided by LIME) are extremely useful. They offer us a way to escape the trade-off between interpretability and predictive performance, which is pervasive in the field of ML.

LIME was a revolutionary technique that transformed the landscape of model interpretability, but it is not the only available method. Another extremely popular and powerful method is SHAP (which is short for SHapley Additive exPlanations).

SHAP is built on the idea of Shapley values, a result from game theory obtained in the 1950s. At first sight, the connections between cooperative games (the context in which Shapley values emerged) and ML explainability are not completely clear. However, there are strong links between the two, which allow us to progress towards understanding ML models’ predictions.

In this post, we start by exploring an example that illustrates an application of Shapley values in its original context. Then, we draw the parallels between game theory and ML explainability, this time with an example from ML. In the end, we present some of the implementation obstacles and point to connections between LIME and SHAP.


And hey -- if you are already familiar with SHAP and want to start using it to understand your models, feel free to head straight to Openlayer!

What are Shapley values

Let’s start with an example that has nothing to do with ML to illustrate the kinds of problems Shapley values solve. At first, this might seem like a pointless digression, but hang in there: this is a great exercise to develop intuition. We will expand on the example provided by Ian Covert in his fantastic blog post.

Imagine you work in a big company, with many employees (let’s call the number of employees d). You all have worked very hard the whole year and, as a consequence, the company made $1,000,000 in profit. As a reward, management decides to distribute the whole profit as end-of-year bonuses to employees.

Source (modified): https://iancovert.com/blog/understanding-shap-sage/

Now, it’s time to start thinking about how to split all that money.

The $1,000,000 profit is a result of the work of all the employees, but would it be fair if management simply split it equally?

Probably not. Some people might deserve a greater share because they made key strategic decisions that contributed directly to the current profit and worked really hard. On the other hand, some people might be entitled to less, since they were basically “hanging around” telling jokes while standing close to the water fountain all day long.

A fair profit split should take into account the contributions made by each employee in different scenarios. To assess them, management came up with a very powerful function, called v. It is capable of telling them exactly what the company’s profit would be if only a given subset of employees were working there.

Mathematically, let D = {1, 2, ..., d} be the set of all d employees, and let S be a subset of D representing the employees working at the company. So:

  • if all employees were working, we would have the initial situation we described, i.e., v(D) = 1,000,000;
  • if employee #1 is the CEO and they were the only one not working, let’s say that the company’s profit would be half as much. In this case, S = {2, 3, ..., d} and v(S) = 500,000;
  • if everyone was working, except for a couple of the new hires, the company’s profit wouldn’t change that much, so for S = {1, 2, ..., d-2}, v(S) = 950,000;
  • if no one was working, obviously the company’s profit would be equal to zero, i.e., S = {} , v(S) = 0.

The above examples are just meant to illustrate how powerful the function v can be. In fact, management can evaluate any possible scenario (i.e., any combination of employees working and the resulting profit).

Ok, but how does v solve management’s problems?

Well, aided by v, management can calculate each employee’s Shapley values, which represent exactly how much of the profit each employee should receive. It turns out that employee i should receive phi_i as an end-of-year bonus, where phi_i is given by:

$$\phi_i = \frac{1}{d}\sum_{S \subseteq D\setminus\{i\}} {d-1 \choose |S|}^{-1} \left(v(S\cup \{i\}) - v(S)\right),$$

where |S| represents the cardinality of set S, i.e., the number of elements in the set.

The above formula might seem scary at first, but worry not. The Shapley value for employee i is basically a weighted average of the marginal contributions they made to the profit in different scenarios. Notice that

$$v(S\cup \{i\}) - v(S)$$

is nothing more than the bump in profit due to employee i in the scenario where the employees in the subset S are working. The summation is done over all 2^(d-1) possible subsets of D without employee i (so that it is possible to measure how much employee i added in profit in each case), and the weights account for the different orders in which employees could have joined, so that the result does not depend on any particular ordering.
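To make the formula less abstract, here is a minimal Python sketch that evaluates it by brute force for a tiny company. The three employees and every profit figure in `PROFIT` are made up for illustration; only the formula itself comes from the text.

```python
from itertools import combinations
from math import comb

def shapley_values(players, v):
    """Exact Shapley values via the subset formula from the text."""
    d = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        # Sum over every subset S of D without i, grouped by size |S|.
        for size in range(d):
            weight = 1.0 / comb(d - 1, size)
            for subset in combinations(others, size):
                S = frozenset(subset)
                # Marginal contribution of i when the employees in S work.
                total += weight * (v(S | {i}) - v(S))
        phi[i] = total / d
    return phi

# Hypothetical profits for each possible set of working employees.
PROFIT = {
    frozenset(): 0,
    frozenset({"ceo"}): 400_000,
    frozenset({"engineer"}): 300_000,
    frozenset({"joker"}): 0,
    frozenset({"ceo", "engineer"}): 1_000_000,
    frozenset({"ceo", "joker"}): 400_000,
    frozenset({"engineer", "joker"}): 300_000,
    frozenset({"ceo", "engineer", "joker"}): 1_000_000,
}

def v(S):
    """Management's function: profit if exactly the employees in S work."""
    return PROFIT[frozenset(S)]

bonuses = shapley_values(["ceo", "engineer", "joker"], v)
for name, bonus in bonuses.items():
    print(f"{name}: ${bonus:,.0f}")
```

With these made-up numbers, the CEO receives $550,000, the engineer $450,000, and the employee who never changes the profit receives nothing. The bonuses add up to the full $1,000,000.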

By computing phi_i for every employee, management has a fair way to distribute the $1,000,000 profit!

Source (modified): https://iancovert.com/blog/understanding-shap-sage/

In his seminal paper in 1953, Lloyd Shapley showed that Shapley values computed in the way shown above are the only possible solution that respects a set of desirable properties. The discussion as to why this set of properties is desirable to achieve a fair distribution can be consulted in Ian’s blog post.

SHAP values in ML

For game theory, the definition of a “game” is a bit looser than the one we use daily and a lot of things can be seen as a game.

The profit distribution problem we explored in the previous section is an instance of a game: the employees are the players and they cooperate to generate a payoff (the profit), which is then distributed among them.

Now, you might be wondering: what does any of this have to do with ML?

Interestingly, it is possible to draw a clear connection between the problem that Shapley values solve in cooperative game theory and the problem of explainability in machine learning.

The input features for a given model are seen as the players. These players cooperate to generate the model’s output, which is seen as the payoff.

This is how Shapley values apply to ML, giving rise to SHAP. The idea is that for a black-box ML model f that makes predictions based on a set of d features, it is possible to understand the influence each one of them had on the model’s predictions.

By doing so, we end up with the SHAP scores, phi_1, ..., phi_d. Due to one of the theoretical properties (often called local accuracy in the ML community), if we add up all of the SHAP scores for a particular prediction, together with a baseline value (the model’s average prediction), we end up with exactly the model’s output. This is trivial if we think about the profit distribution problem from the previous section (after all, summing up all the bonuses should result in the company’s profit), but it is less obvious for ML.
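The additivity property is easiest to check on a linear model, where (assuming independent features) the Shapley value of feature i has the known closed form w_i * (x_i - mean_i). The sketch below uses made-up weights and a made-up background dataset. Note that the scores are measured relative to a baseline, the average prediction, so it is the scores plus the baseline that recover the model’s output.

```python
# Toy linear "model": f(x) = BIAS + sum_i w_i * x_i (all numbers are made up).
WEIGHTS = [0.03, -0.002, 0.5]
BIAS = 0.1

def f(x):
    return BIAS + sum(w * xi for w, xi in zip(WEIGHTS, x))

# Hypothetical background data, used to estimate the feature means.
background = [
    [25.0, 700.0, 1.0],
    [40.0, 650.0, 0.0],
    [61.0, 580.0, 1.0],
    [33.0, 720.0, 0.0],
]

n = len(background)
means = [sum(row[j] for row in background) / n for j in range(len(WEIGHTS))]
base_value = sum(f(row) for row in background) / n  # average prediction

# The sample we want to explain.
x = [92.0, 600.0, 1.0]

# Closed-form scores for a linear model with independent features.
phi = [w * (xi - m) for w, xi, m in zip(WEIGHTS, x, means)]

# Local accuracy: baseline + sum of scores equals the model's output.
print(base_value + sum(phi), f(x))
```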

To make matters more concrete, let’s explore a practical example.

Let’s say we have a platform with paying users. Furthermore, we built a model that predicts whether a user will churn (i.e., stop using our service) or not based on a set of features, such as Age, Geography, and CreditScore, among others. With such a model in hand, we can try to identify which users are likely to churn and take action to avoid it.

The problem is that our model predicted some users were going to churn, when in fact, they didn’t. This made our organization waste precious resources trying to retain users that were not going to churn in the first place.

Let’s upload the trained model and the validation set to Openlayer and explore the model’s predictions with SHAP!

First, we can filter our validation set and look only at the data from the specific error class we are interested in, namely, samples for which the model predicted Exit, but for which the label was Retained.

Done! Now the data displayed comes only from the error class we are interested in. When we click on a particular row, we auto-magically get the explanation for that prediction. Let’s click one of the rows shown to get the explanations with SHAP.

The colors and small numbers next to each feature are the SHAP values. As a side note, as can be seen in the upper-left corner of the image above (where it says “Show LIME values”), LIME is also available. In fact, that’s what we used in a previous post.

As we can see, Age was the feature that contributed the most to the model’s misprediction on this particular sample, probably because its value is so high. The platform might not have many 92-year-old users, which would explain why our model thought this specific user would churn.

Use the what-if analysis to see how our model would behave. We can simply type another value for the feature Age, right below “Comparison value”. Let’s change the age to 30 and leave the remaining features as they were, to see if the model would predict correctly in this case.

Yup! Now our model predicted the user would be retained and the Age is strongly nudging the model’s prediction in the right direction! Moreover, notice that the other features received different scores as well because we are essentially explaining a new point.

Challenges with SHAP in practice

If you are still skeptical about the feasibility of the practical implementation of SHAP, you’re right, you should be. Two gaps need to be addressed to allow the use of Shapley values in the context of ML.

First, computing the Shapley values in the way that we presented would require the ML models to behave the way the powerful function from management did. We should be able to evaluate the model’s output for any subset of input features.

In the churn classifier example, this would mean evaluating the model’s prediction if we knew only the CreditScore; then only the CreditScore and the Geography; then only the Geography; and all other combinations.
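To get a feel for what “all other combinations” means, the sketch below enumerates every subset of a hypothetical three-feature version of the churn classifier. The feature names are taken from the example above; restricting the list to three of them is an assumption made to keep the output short.

```python
from itertools import combinations

# Three of the churn model's features (a shortened, illustrative list).
features = ["Age", "Geography", "CreditScore"]

def all_subsets(items):
    """Yield every subset of items, from the empty set to the full set."""
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield subset

coalitions = list(all_subsets(features))
for coalition in coalitions:
    print(coalition if coalition else "(no features known)")
```

Even with only three features there are already 2^3 = 8 scenarios in which the model would have to be evaluated.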

The problem is that if we train a model using all the features, we are not able, in theory, to evaluate it using only a subset of them.

The second problem is the summation we have to perform to compute the weighted average. If we have d features, we need to sum over 2^(d-1) terms to compute each feature’s score.

In the churn classifier from the previous example, where there were 10 features, this is equal to 512, which is large, but not problematic at all for today’s computers.

What if we had 400 features? What about 1000?

Notice that we have a problem that explodes combinatorially and can very quickly become prohibitive in practical situations.
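A few lines of Python are enough to see how quickly the count grows. The helper name `terms_per_feature` is ours, and the sketch only reports how many digits each count has, since the counts themselves are too long to be worth printing.

```python
def terms_per_feature(d):
    """Number of subsets of the other d - 1 features (one term per subset)."""
    return 2 ** (d - 1)

print(terms_per_feature(10))  # 512, as in the churn classifier example

for d in (400, 1000):
    n = terms_per_feature(d)
    print(f"d={d}: the number of terms has {len(str(n))} digits")
```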

Fortunately, the practical implementations of SHAP contain some clever workarounds for both of these issues. We won’t go into the details of each in this post, to avoid getting overly technical. Let’s limit ourselves to saying that SHAP overcomes the first issue by using conditional expectations and assuming feature independence, and it overcomes the second issue by cleverly using sampling.
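As a rough illustration of the two workarounds, here is a minimal sketch in plain Python. It is not the algorithm used by the SHAP library: it swaps in permutation sampling, one standard Monte Carlo estimator of Shapley values, and it handles “missing” features by averaging the model over a background dataset, which treats features as independent. The `model` function, the background rows, and the explained sample are all made up.

```python
import random

def value(model, x, background, subset):
    """v(S): average prediction with the features in `subset` fixed to x's
    values and the other features taken from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)

def sampled_shapley(model, x, background, n_permutations=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions over
    random feature orderings instead of enumerating every subset."""
    rng = random.Random(seed)
    d = len(x)
    phi = [0.0] * d
    for _ in range(n_permutations):
        order = list(range(d))
        rng.shuffle(order)
        present = set()
        prev = value(model, x, background, present)
        for j in order:
            present.add(j)  # feature j "joins" the coalition
            cur = value(model, x, background, present)
            phi[j] += cur - prev
            prev = cur
    return [p / n_permutations for p in phi]

# Hypothetical churn score over (age, credit_score, is_active).
def model(row):
    age, credit, active = row
    return (0.01 * age + 0.0005 * (700 - credit)
            - 0.2 * active + 0.002 * age * (1 - active))

# Made-up background data standing in for the validation set.
background = [
    [25.0, 700.0, 1.0],
    [40.0, 650.0, 0.0],
    [61.0, 580.0, 1.0],
    [33.0, 720.0, 0.0],
    [52.0, 610.0, 1.0],
]

x = [92.0, 600.0, 0.0]
phi = sampled_shapley(model, x, background)
base_value = value(model, x, background, set())
print(phi, base_value + sum(phi), model(x))
```

Because each ordering telescopes from the baseline to the full prediction, the estimated scores plus the baseline still add up to the model’s output, even though only a sample of orderings was used.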


Even though it might not seem like it at first, LIME (which we presented in a previous blog post) and SHAP are closely connected. In fact, they are both solutions to the same optimization problem, but with slightly different choices made along the way.

This is one of the key results shown in the paper “A Unified Approach to Interpreting Model Predictions”, by Scott Lundberg and Su-In Lee, which draws clear connections between various explainability methods, bundling all of them into the category of additive feature attribution methods.

Again, we leave the details for a future blog post, and meanwhile, the interested reader should check out the above paper for details.

* A previous version of this article listed the company name as Unbox, which has since been rebranded to Openlayer.
