I find the concept of luck vs skill in games to be fascinating, because the common intuitions are just so wrong. The common intuition is that some games involve more luck, and some games involve more skill. On the extreme end of luck, we have the lottery; on the extreme end of skill, we have chess. The orthodox view was best expressed by a Vox article/video, which included the following image:
The Vox image also shows several sports, and the position of each sport is based on the statistical analysis of Michael Mauboussin. The details of the analysis aren’t explicitly described, but it essentially analyzes national tournaments for each sport and estimates how much of the variance in outcomes is explained by luck versus skill.
Mauboussin did not analyze chess. Vox added chess themselves, pulling a claim out of their ass. Without doing any analysis, I can guarantee that if you applied the same statistical analysis to chess, you would not find that chess was 100% skill. The analysis will only show that a game is pure skill if the same people consistently win all their games. I quickly checked the US Chess Championship winners, and while some names show up repeatedly, the winners are not 100% consistent, so chess would not be deemed a pure-skill game by this analysis.
So what gives? Is the statistical analysis bogus, or is the claim that chess is 100% skill bogus? Trick question. Both of them are bogus.
Proposition 1: Explanation of variance is a poor way to model skill vs luck.
Let’s step back and think about this philosophically. Skill and luck are concepts that we have prior to any knowledge of statistics. When people use statistics to create measures of luck and skill, that’s just a mathematical model for a preexisting idea. Sometimes a model can show us something surprising and counterintuitive. But sometimes a model is just failing to accurately represent the thing it’s supposed to represent.
One of the surprising conclusions of Mauboussin’s statistical model is that luck and skill are not properties of the game itself, but properties of the tournament and its players.1 Suppose I created a tournament where I play basketball against a professional; I would just lose every single game. The analysis would show that the outcomes are 100% explained by the skill difference between us. Now suppose we created a tournament where two equally skilled players play games against each other. By “equally skilled”, I mean that in the matchup each one wins 50% of the time. The statistical analysis would show that the outcomes are 100% explained by chance. After all, a difference in skill cannot explain anything when the difference in skill is zero.
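This is easy to verify with a toy simulation. The sketch below is not Mauboussin’s actual method, just a minimal variance decomposition using the law of total variance: the “skill” share is the variance of the per-matchup mean outcomes, and the rest is chance. The matchup win probabilities are made-up illustrations.

```python
import random

def skill_luck_split(win_probs, games_per_matchup=10_000):
    """Decompose outcome variance via the law of total variance:
    Var(outcome) = Var(E[outcome | matchup])   <- the 'skill' share
                 + E[Var(outcome | matchup)]   <- the 'luck' share
    win_probs: win probability of player 1 in each matchup."""
    outcomes = []
    means = []
    for p in win_probs:
        results = [1 if random.random() < p else 0
                   for _ in range(games_per_matchup)]
        outcomes.extend(results)
        means.append(sum(results) / games_per_matchup)
    grand = sum(outcomes) / len(outcomes)
    total_var = sum((o - grand) ** 2 for o in outcomes) / len(outcomes)
    skill_var = sum((m - grand) ** 2 for m in means) / len(means)
    return skill_var / total_var  # share of variance "explained by skill"

# Me vs. a professional: outcomes are certain, so skill explains everything.
print(skill_luck_split([0.0, 1.0]))   # -> 1.0
# Two equally skilled players: every game is a coin flip.
print(skill_luck_split([0.5, 0.5]))   # -> approximately 0.0
```

The same code run on the same game gives opposite answers depending on who shows up to the tournament, which is exactly the point.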
Which brings me to the second surprising conclusion: a game between two equally skilled players is a game without skill?
We have two options: bite the bullet and accept these counterintuitive conclusions, or else admit that the model is inaccurate or incomplete.
…Or we can take Vox’s route, uncritically accepting the model while contradicting ourselves by describing chess as a game of pure skill. Within Mauboussin’s model, it is not merely wrong to claim that chess (detached from any tournament) is pure skill; it is nonsensical.
Proposition 2: Luck and skill are distinct properties.
I earnestly believe that one of the world’s premier philosophers on luck vs skill in games also happens to be the creator of Magic: The Gathering. Richard Garfield has talked about luck vs skill in many places, most notably an article in 1997 (!) and a talk in 2012 (and here’s another talk with better audio). The main point he makes is that luck and skill are two distinct axes that can be used to describe a game. You can distinguish high-luck, high-skill games from low-luck, low-skill games.
Garfield describes a thought-experiment-game called Rando-chess. In Rando-chess, you play a game of chess, and then you roll a die. If you roll a 1, the person who lost the chess game wins; otherwise the winner of the chess game wins. This obviously involves more luck than chess, but does it involve less skill? The strategy of Rando-chess is identical to that of chess, and it takes an identical amount of time and effort to master. One’s skill may have less impact on who wins or loses, but it seems inaccurate to say that the game involves less skill.
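The dilution of skill’s impact on outcomes can be made precise. If the stronger player wins the underlying chess game with probability p, then in Rando-chess they win whenever they win the chess game and the die doesn’t flip the result, or lose the chess game and the die does flip it:

```python
def rando_chess_win_prob(p_chess, flip_prob=1 / 6):
    """Probability the stronger player wins Rando-chess, given they win
    the underlying chess game with probability p_chess. With probability
    flip_prob (rolling a 1 on a six-sided die), the result is reversed."""
    return p_chess * (1 - flip_prob) + (1 - p_chess) * flip_prob

print(rando_chess_win_prob(0.9))  # 0.9 * 5/6 + 0.1 * 1/6 = 0.7666...
print(rando_chess_win_prob(0.5))  # an even matchup stays even: 0.5
```

A 90% favorite in chess is only a 77% favorite in Rando-chess, yet nothing about the skill required to reach that 90% has changed.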
In fact, it may be best to detach the concept of skill from the concept of winning entirely. Take a game like Minecraft, which has no end point and no winners, but still makes use of skills. Or take a non-game like mathematics, which certainly involves skills. Skill is an independent entity, and winning is a reward you get in the context of some kinds of games.
Proposition 3: Chess involves luck.
Garfield describes another thought-experiment-game, where you have to choose between two identical doors, and one of them is the winning door. Obviously this is a game of luck. But what if there’s a complex puzzle that tells you the correct door, and the puzzle is simply beyond your understanding? It’s still a game of luck, because a puzzle beyond your understanding simply doesn’t help you.
Chess may be deterministic, but it is essentially a puzzle beyond all understanding. You don’t ignore the puzzle, but neither can you fully solve it, not even with a computer. The best you can do is assess which moves are likely to be winning. But you simply don’t know for sure, so it’s still partly like the game where you have to select the correct door.
Random number generation (e.g. with dice or cards or computers) is not the only form of luck. Luck is defined relative to your personal knowledge. If a computer has a deterministic method for choosing the next random number, but you don’t understand the method, then it’s still random from your perspective. If every chess position has deterministic winning and losing moves, but you can’t distinguish between them with certainty, then it’s still random from your perspective.
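This is, in fact, how ordinary pseudorandom number generators work. A minimal sketch with a linear congruential generator (the constants below are the common Numerical Recipes choices): every output is fully determined by the seed and the update rule, yet to anyone who doesn’t know them, the stream is indistinguishable from chance.

```python
def lcg(seed, a=1664525, c=1013904223, m=2 ** 32):
    """Linear congruential generator: state -> (a * state + c) mod m.
    Completely deterministic, but unpredictable to an observer who
    doesn't know a, c, m, and the seed."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m  # uniform-looking value in [0, 1)

gen = lcg(seed=42)
print([round(next(gen), 3) for _ in range(5)])
```

Run it twice with the same seed and you get the same “random” sequence, which is exactly the sense in which chess positions can be deterministic and still random from your perspective.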
Proposition 4: The introduction of luck often enhances the level of skill, rather than diminishing it.
I believe that, generally speaking, introducing an element of luck into a game makes it harder to master. I have three arguments:
1. Rather than considering the single outcome for each possible move, you have to consider the collection of possible outcomes, and the likelihood of each.
2. Improving at a game often involves assessing whether you played well or poorly. If luck is involved, then it’s harder to tell when you played well, because the outcome might have been the result of good plays, or it might have been the result of good luck.
3. The more luck in a game, the more likely players are to find themselves in a variety of game states. And so players must master the game from a wide range of positions, both winning and losing.
Of course, these arguments don’t apply to every situation (try applying them to Rando-chess).
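The first argument can be made concrete. In a deterministic game, evaluating a move means evaluating one outcome; with luck, a move is a whole distribution, and a skilled player must weigh each outcome by its probability. A minimal sketch with made-up payoffs:

```python
def evaluate_move(outcomes):
    """Expected value of a move under luck.
    outcomes: list of (probability, payoff) pairs for that move."""
    return sum(p * payoff for p, payoff in outcomes)

# A hypothetical choice in a dice game: a safe move vs. a gamble.
safe = [(1.0, 2)]                # always scores 2
gamble = [(0.5, 5), (0.5, 0)]    # scores 5 or 0 on a coin flip
print(evaluate_move(safe), evaluate_move(gamble))  # 2.0 2.5
```

The gamble is the better move on average, but a player who only compares best-case or single outcomes will misjudge it; that extra layer of reasoning is part of what luck adds to the skill of the game.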
Proposition 5: A better model could produce better insight about skill in games.
Although I’ve argued that “skill” is a distinct concept from “the degree to which a game rewards skill”, both of these are still useful concepts that we might like to model. I think a good starting point is something like the Elo rating system created for chess. In this model, each player has a certain mean performance and a certain variance in performance. Performance can’t be measured directly, but if one player beats another, then we know the winner had the higher performance in that particular game.
The mean performance might be considered the “skill factor” of the game, and the variance in performance is the “luck factor”. But the Elo rating system can’t provide an absolute measure of the luck factor, because “performance” is scaled in such a way that the variance in performance is constant.2 At best, the model describes the ratio of skill to luck.
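The standard Elo expected-score formula shows what “scaled so that the variance is constant” means in practice: only the rating difference matters, measured in fixed units of 400 points per factor-of-ten in odds.

```python
def expected_score(rating_a, rating_b):
    """Elo expected score for player A: the win probability implied by
    the rating gap, on a scale of 400 points per factor of 10 in odds."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

print(expected_score(1600, 1600))  # equal ratings -> 0.5
print(expected_score(1800, 1600))  # a 200-point edge -> about 0.76
```

Note that a 200-point gap predicts the same 76% win rate whether the players are rated 1000 and 800 or 2800 and 2600; the model has no absolute unit of luck, only the skill-to-luck ratio baked into the 400-point scale.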
Now, if you have a tournament, you’ll have a certain variance in skill among the players, but it should be clear that this is a property of the players, and not a property of the game. If we want to describe the properties of the game, I think we need a function that estimates the mean performance of someone at a certain skill level (where skill is a combination of talent and practice). I’ll call this function M(skill).
There are two general properties that I might use to describe M(skill). The first property is the derivative, which describes how much you’re rewarded for a certain increase in skill, where the unit of reward is increased probability of victory. Note that M(skill) might have a different derivative at different levels of skill. The second property is how far you can increase your skill before M(skill) levels off. This is basically the depth of the game: how much time you can dedicate to it before the rewards dry up.
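To make these two properties concrete, here is a purely hypothetical M(skill) with a logistic shape (nothing in the argument depends on this particular form): performance is taken as win probability against a fixed reference opponent, a depth parameter sets where returns level off, and a steepness parameter sets how strongly each unit of skill is rewarded.

```python
import math

def M(skill, depth=10.0, steepness=1.0):
    """Hypothetical M(skill): mean performance (win probability vs. a
    fixed reference opponent) as a function of skill. 'depth' sets where
    returns level off; 'steepness' scales the derivative."""
    return 1 / (1 + math.exp(-steepness * (skill - depth / 2)))

# The derivative (the reward per unit of skill) is largest mid-curve
# and shrinks toward zero as skill approaches the game's depth.
eps = 1e-6
for s in [0, 5, 10]:
    slope = (M(s + eps) - M(s - eps)) / (2 * eps)
    print(s, round(M(s), 3), round(slope, 3))
```

In this toy model, a game with a large derivative at low skill rewards beginners for practicing, while a game with a long run before the plateau is “deep” in the sense described above.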
So this model provides a way to think about the “depth” of the game, and the degree to which the game rewards skill. What the model does not do, is provide a way to measure how much skill is in the game. Is that because the model is incomplete, or is it because “How much skill is in the game?” is an incoherent question? I leave it for the reader to judge.
1. There’s an analogy to be made to a concept in evolutionary biology known as heritability. Heritability is a statistical measure of how much variance in a biological trait is explained by variance in genes. Counterintuitively, heritability is not a property of the trait itself, but a property of a specific population. In a hypothetical population where everyone is genetically identical, variance in genes would explain none of the variance in biological traits. (return)
2. Given two players, the expected win rate of the first player is

E1 = 1 / (1 + 10^((M2 - M1) / 400))

where M1 is the mean performance of the first player and M2 is the mean performance of the second player. This implies that the standard deviation of performance per player per game is 223, if I did my math correctly. (return)
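The math checks out. Under the logistic model above, the performance difference M1 − M2 follows a logistic distribution with scale s = 400 / ln(10); a logistic with scale s has standard deviation s·π/√3, and splitting that variance evenly between the two players divides the standard deviation by √2:

```python
import math

# Logistic Elo model: performance difference has scale 400 / ln(10).
scale = 400 / math.log(10)
sd_difference = scale * math.pi / math.sqrt(3)   # sd of M1 - M2
sd_per_player = sd_difference / math.sqrt(2)     # sd per player per game
print(round(sd_per_player))  # -> 223
```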