The Prisoner’s Dilemma provides endless fascination for students of human psychology and has served as the foundation for many studies using game theory. There are countless variations of the basic idea; here is one common version.
The basic version goes like this. Two criminals are arrested, but the police can’t convict either on the primary charge, so they plan to sentence both to a year in jail on a lesser charge. Each prisoner, unable to communicate with the other, is given the option of testifying against their partner. If one testifies and the partner remains silent, the partner gets three years and the testifier goes free. If both testify, each gets two years. If both remain silent, each gets one year.
The best result for the two as a group is to cooperate, i.e., not testify against each other, since the total time served in that case is two years, less than the total for any other outcome. The paradox is that when each individual selfishly looks after their own interests, defecting (i.e., testifying) seems best whatever the other person does: if the other person stays silent, you go free by testifying instead of serving a year; if the other person testifies, you get two years by also testifying instead of three by staying silent. And you know that the other person is reasoning the same way.
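To make the logic concrete, here is a small sketch (the table and names are mine, not from any study) checking both claims: that testifying dominates whatever your partner does, and that mutual silence minimises the pair’s total years.

```python
# Payoff in years of jail time (lower is better), indexed by
# (my_move, partner_move); moves are "silent" or "testify".
# The numbers are the ones from the story above.
YEARS = {
    ("silent",  "silent"):  (1, 1),
    ("silent",  "testify"): (3, 0),
    ("testify", "silent"):  (0, 3),
    ("testify", "testify"): (2, 2),
}

# Whatever the partner does, testifying gives me strictly fewer years:
for partner in ("silent", "testify"):
    years_if_silent  = YEARS[("silent",  partner)][0]
    years_if_testify = YEARS[("testify", partner)][0]
    assert years_if_testify < years_if_silent  # defection dominates

# Yet mutual silence minimises the total years served by the pair:
totals = {moves: sum(years) for moves, years in YEARS.items()}
assert min(totals, key=totals.get) == ("silent", "silent")
```

Running this confirms the paradox: each prisoner individually does better by testifying, yet the pair as a whole does worst when both follow that logic.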
But in real life, people seem to cooperate more than this analysis predicts, and the question is why. Research on this question has important implications for understanding the emergence and persistence of altruistic behavior in species.
An important issue is whether the game is played just once or repeatedly between the same two players. If it is played just once, defecting seems likely, since there is no punishment for doing so. But if the game is played over and over against the same opponent, the dynamic changes, because you can punish (or reward) the other person based on their past behavior. There is also the issue of whether the two players are each unaware of the other’s move (or, equivalently, the moves are made simultaneously), or one person moves first and the second person sees that move before choosing (a sequential game).
This has spawned a whole research industry in game theory. Various computer programs have been developed to find the optimal strategy, and without going into all the literature, for a long time it seemed that the program that usually wins the iterated simultaneous game is one that adopts a ‘tit-for-tat’ strategy: on the first move you cooperate, and on every subsequent move you do whatever your opponent did on the previous move. So if your opponent defected on the first move, you defect on your second move; your third move matches your opponent’s second move, and so on. The success of this strategy suggests that cooperative behavior emerges when we have a memory of how people have behaved towards us in the past.
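Tit-for-tat is simple enough to state in a few lines of code. Here is a minimal sketch using the standard iterated-dilemma scoring convention (points rather than jail years, so higher is better); the function names and match harness are mine:

```python
def tit_for_tat(opponent_history):
    """Cooperate on the first move, then copy the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """A maximally selfish baseline opponent."""
    return "D"

# Standard iterated-dilemma payoffs in points (higher is better):
# mutual cooperation 3 each, mutual defection 1 each,
# lone defector 5, lone cooperator 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(strategy_a, strategy_b, rounds):
    """Play an iterated match; each strategy sees only the opponent's moves."""
    score_a = score_b = 0
    moves_a, moves_b = [], []
    for _ in range(rounds):
        move_a = strategy_a(moves_b)
        move_b = strategy_b(moves_a)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a += pa
        score_b += pb
        moves_a.append(move_a)
        moves_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, always_defect, 10))  # → (9, 14)
print(play(tit_for_tat, tit_for_tat, 10))    # → (30, 30)
```

Against an opponent who always defects, tit-for-tat loses only the first round and then punishes every subsequent defection; against a copy of itself it cooperates forever, which is why it does so well when averaged over a mixed field of opponents.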
Now a new study has looked at how real prisoners play the game compared to students.
Building on game theory and behavioural economics research showing that humans are more cooperative than the purely rational model economists traditionally use, they expected a fair amount of first-mover cooperation, even in the simultaneous version, where there is no way to react to the other player’s decision.
And even in the sequential game, where you get a higher payoff for betraying a cooperative first mover, they expected that a fair number of second movers would still reciprocate.
As for the difference between student and prisoner behaviour, you’d expect that a prison population might be more jaded and distrustful, and therefore more likely to defect.
The results went exactly the other way. In the simultaneous game, only 37% of students cooperated, while inmates cooperated 56% of the time.
On a pair basis, only 13% of student pairs managed to achieve the best mutual outcome and cooperate, whereas 30% of prisoner pairs did.
In the sequential game, far more students (63%) cooperated, so their mutual cooperation rate jumped to 39%. For prisoners, the rates remained about the same.
What’s interesting is that the simultaneous game requires far more blind trust from both parties, since you have no chance to retaliate or make up for being betrayed later. Yet prisoners were still significantly more cooperative in that scenario.
The prisoners in this case were women. It would be interesting to repeat the experiment with male prisoners.
There was a recent claim of a new ‘Zero Determinant’ (ZD) strategy that beats tit-for-tat, implying that selfish behavior beats cooperative behavior. The paper, titled “Iterated prisoners’ dilemma contains strategies that dominate any evolutionary opponent” by W. Press and F. J. Dyson, was published in 2012 in the Proceedings of the National Academy of Sciences. A summary of their argument can be read here, but a later critique says that “while ZD strategies are weakly dominant, they are not evolutionarily stable and will instead evolve into less coercive strategies”, i.e., more cooperative players will eventually win out again.
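For the curious, here is a hedged sketch of what an extortionate ZD strategy looks like in play. A ZD strategy is ‘memory-one’: it cooperates with a probability that depends only on the previous round’s outcome. The probability vector below follows Press and Dyson’s construction for an extortion factor of 3 under the standard point payoffs (T=5, R=3, P=1, S=0); the simulation harness and the choice of an unconditionally cooperative victim are my own illustration, not from the paper.

```python
import random

random.seed(0)

# Probability that player X cooperates, given the previous round's
# outcome (x_move, y_move).  This vector is an extortionate ZD strategy
# with extortion factor chi = 3 for payoffs T=5, R=3, P=1, S=0,
# following Press & Dyson's construction (an assumption of this sketch).
EXTORT = {("C", "C"): 11/13, ("C", "D"): 1/2,
          ("D", "C"): 7/26,  ("D", "D"): 0.0}

# Standard point payoffs (higher is better) for (x_move, y_move).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def simulate(rounds=200_000):
    """Extortioner X vs. an unconditional cooperator Y; average payoffs."""
    x_move = "C"                       # arbitrary opening move
    x_total = y_total = 0
    for _ in range(rounds):
        y_move = "C"                   # Y always cooperates
        px, py = PAYOFF[(x_move, y_move)]
        x_total += px
        y_total += py
        # X's next move depends stochastically on this round's outcome.
        x_move = "C" if random.random() < EXTORT[(x_move, y_move)] else "D"
    return x_total / rounds, y_total / rounds

x_avg, y_avg = simulate()
# In the long run X averages about 41/11 ≈ 3.7 points per round and Y
# about 21/11 ≈ 1.9, so X's surplus over the mutual-defection payoff P=1
# is roughly three times Y's — the 'extortion factor' in action.
```

This shows the sense in which ZD strategies are coercive: the extortioner guarantees itself an outsized share against a cooperator. The critique quoted above is about what happens in evolving populations of such strategies, where this advantage does not persist.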
There is a British TV show called Golden Balls based on the Prisoner’s Dilemma in which the two players are strangers. It is a one-off game and they play simultaneously but can communicate with each other prior to making their moves. Here is an episode in which one player was particularly clever in exploiting human psychology.