Real prisoners and the Prisoner’s Dilemma

The Prisoner’s Dilemma is a source of endless fascination for students of human psychology and has served as the foundation for many studies using game theory. There are many variations on the basic idea.

The basic version goes like this. Two criminals are arrested, but the police can’t convict either on the primary charge, so they plan to sentence each of them to a year in jail on a lesser charge. Each prisoner, unable to communicate with the other, is given the option of testifying against their partner. If one testifies and the partner remains silent, the partner gets three years and the one who testified goes free. If both testify, each gets two years. If both remain silent, each gets one year.

The best result for both people as a group is to cooperate, i.e., not testify against the other, since the total amount of time served in that case is two years, which is less than the total time served by both for any other option. The paradox is that when viewed in terms of each individual selfishly looking after their own interests, whatever the other person does, it seems best for you to defect (i.e., testify). If the other person does not testify, then you go free if you testify, instead of serving one year. If the other person testifies and you don’t, you get three years, while if you also testify you get two years. And you know that the other person is figuring out the same thing.
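The reasoning above can be checked mechanically. Here is a minimal sketch in Python that encodes the sentences from the basic version and asks which move is best for you against each possible move by your partner. The names `YEARS`, `best_reply` and `total_years` are made up for this illustration.

```python
# Sentences (in years) from the basic version described above.
# YEARS[(my_move, partner_move)] -> (my sentence, partner's sentence)
SILENT, TESTIFY = "silent", "testify"

YEARS = {
    (SILENT, SILENT): (1, 1),
    (SILENT, TESTIFY): (3, 0),
    (TESTIFY, SILENT): (0, 3),
    (TESTIFY, TESTIFY): (2, 2),
}


def best_reply(partner_move):
    """Return the move that minimises my own sentence, given the partner's move."""
    return min((SILENT, TESTIFY), key=lambda me: YEARS[(me, partner_move)][0])


def total_years(my_move, partner_move):
    """Return the combined time served by both prisoners."""
    return sum(YEARS[(my_move, partner_move)])


if __name__ == "__main__":
    for partner in (SILENT, TESTIFY):
        print(f"If my partner chooses '{partner}', my best reply is '{best_reply(partner)}'")
    for moves in YEARS:
        print(moves, "-> total years served:", total_years(*moves))
```

Testifying comes out as the best reply in both cases, even though mutual silence gives the smallest combined sentence. That gap between the individual and the group optimum is the dilemma.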

But in real life, people seem to cooperate more than one might expect and the question is why this is so. This research has important implications for understanding the emergence and existence of altruistic behavior in species.

There is an important issue as to whether the game is played just once or repeatedly between the same two players. If it is played just once, then defecting seems likely since there is no punishment for doing so. But if you play the game over and over against the same opponent, the dynamic changes, since you can punish (or reward) the other person based on their past behavior. There is also the issue of whether each player is unaware of the other person’s move (or, equivalently, the moves are made simultaneously) or whether one person moves first and the second person is aware of that move (a sequential game).

This has spawned a whole research industry involving game theory. Various computer programs have been developed to see what the optimal strategy would be. I don’t want to go into all the literature, but for a long time it seemed like the program that usually wins in the repeated simultaneous game is one that adopts a ‘tit-for-tat’ strategy. In this program, you cooperate on the first move, and on each later move you do what your opponent did on the previous one. So if your opponent defected on the first move, you defect on your second move. And your third move will correspond to your opponent’s second move, and so on. This strategy suggests that cooperative behavior emerges when we have a memory of how people have behaved towards us in the past.
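To make the tit-for-tat rule concrete, here is a small sketch of a repeated game in Python. It is only an illustration, not a reconstruction of any of the tournament programs: the sentences are the ones from the basic version above, and the names `tit_for_tat`, `always_defect` and `play`, as well as the choice of 10 rounds, are assumptions made for the example.

```python
C, D = "C", "D"  # C = cooperate (stay silent), D = defect (testify)

# Years in jail for (my move, opponent's move), from the basic version above.
YEARS = {(C, C): 1, (C, D): 3, (D, C): 0, (D, D): 2}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return C if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect on every move, whatever the opponent has done."""
    return D


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated simultaneous game; return both move lists and total years."""
    moves_a, moves_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        # Each player sees only the other's past moves, not the current one.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        years_a += YEARS[(a, b)]
        years_b += YEARS[(b, a)]
    return moves_a, moves_b, years_a, years_b


if __name__ == "__main__":
    for name, opponent in (("tit-for-tat", tit_for_tat), ("always-defect", always_defect)):
        moves_a, moves_b, years_a, years_b = play(tit_for_tat, opponent)
        print(f"tit-for-tat vs {name}: {''.join(moves_a)} / {''.join(moves_b)}, "
              f"years served {years_a} and {years_b}")
```

In this sketch tit-for-tat is exploited only on the first move by the unconditional defector and retaliates from then on, while two tit-for-tat players cooperate throughout and each serve far less time.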

Now a new study has looked at how real prisoners play the game compared to students.

They expected, building off of game theory and behavioural economic research that shows humans are more cooperative than the purely rational model that economists traditionally use, that there would be a fair amount of first-mover cooperation, even in the simultaneous simulation where there’s no way to react to the other player’s decisions.

And even in the sequential game, where you get a higher payoff for betraying a cooperative first mover, a fair amount will still reciprocate.

As for the difference between student and prisoner behaviour, you’d expect that a prison population might be more jaded and distrustful, and therefore more likely to defect.

The results went exactly the other way. In the simultaneous game, only 37% of students cooperated. Inmates cooperated 56% of the time.

On a pair basis, only 13% of student pairs managed to get the best mutual outcome by both cooperating, whereas 30% of prisoner pairs did.
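As a rough consistency check on the simultaneous-game figures, one can ask what pair rate would follow if the two members of a pair decided independently at the reported individual rate. That independence is an assumption made for this sketch, not something stated in the study summary, and `expected_mutual_cooperation` is a hypothetical helper name.

```python
def expected_mutual_cooperation(individual_rate):
    """Chance that both players cooperate, ASSUMING they choose independently."""
    return individual_rate ** 2


# (group, reported individual cooperation rate, reported mutual cooperation rate)
REPORTED = (
    ("students", 0.37, 0.13),
    ("inmates", 0.56, 0.30),
)

if __name__ == "__main__":
    for group, individual, reported_pairs in REPORTED:
        expected = expected_mutual_cooperation(individual)
        print(f"{group}: expected {expected:.1%} of pairs, reported {reported_pairs:.0%}")
```

Under that assumption the expected pair rates land within about a percentage point or two of the reported 13% and 30%, so the pair figures look like what simple chance pairing of individual choices would produce.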

In the sequential game, way more students (63%) cooperate, so the mutual cooperation rate skyrockets to 39%. For prisoners, it remains about the same.

What’s interesting is that the simultaneous game requires far more blind trust from both parties, and you don’t have a chance to retaliate or make up for being betrayed later. Yet prisoners are still significantly more cooperative in that scenario.

The prisoners in this case were women. It would be interesting to repeat the experiment with male prisoners.

There was a recent claim of a new Zero Determinant (ZD) strategy that beats tit-for-tat, implying that selfish behavior beats cooperative behavior. The paper that discusses this, “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent” by W. Press and F. J. Dyson, was published in 2012 in the Proceedings of the National Academy of Sciences. A summary of their argument can be read here, but a later critique says that “while ZD strategies are weakly dominant, they are not evolutionarily stable and will instead evolve into less coercive strategies”, i.e., more cooperative players will eventually win out again.

There is a British TV show called Golden Balls based on the Prisoner’s Dilemma in which the two players are strangers. It is a one-off game and they play simultaneously but can communicate with each other prior to making their moves. Here is an episode in which one player was particularly clever in exploiting human psychology.

Comments

machintelligence says

July 25, 2013 at 9:58 am

Inmates cooperated 56% of the time.

So there is such a thing as honor among thieves!
Dunc says

July 25, 2013 at 10:02 am

I’m not surprised… Whether there is truly honour amongst thieves is debatable, but if you’re looking for a population who are unlikely to co-operate with the police, you couldn’t do much better. On the one hand, becoming known as a “grass” is both extremely harmful to your future prospects in the criminal fraternity and highly dangerous for your immediate health, and on the other, these are people who are better placed than most to know exactly how trustworthy the police and courts are when it comes to honouring deals (i.e. not very).
One Brow says

July 25, 2013 at 10:49 am

There was also an American TV show on the Game Show Network, called Friend or Foe?, that used a Prisoner’s Dilemma variation in the final round.
sigurd jorsalfar says

July 25, 2013 at 11:45 am

@2 Agreed. The answer may well be that the punishment doled out within the criminal community for ratting is perceived as far worse than what’s being doled out by the justice system. So you keep your mouth shut, knowing that the other prisoner will likely do the same thing for the same reason.
Dunc says

July 25, 2013 at 12:09 pm

Of course, the fact that grassing gets you beaten up or killed is simply the collective outcome of the iterated Prisoner’s Dilemma in an entire community. Having realised that it’s better for everyone if nobody defects, the community then institutes measures to punish those who do. It’s actually a great example of socially emergent behaviour in action.
machintelligence says

July 25, 2013 at 12:33 pm

I hadn’t thought of that, but it certainly makes sense.
Jeffrey Johnson says

July 25, 2013 at 8:40 pm

Exactly. Anyone who has had a lot of contact with people involved with illegal drugs could have predicted this. Avoiding arrest and hostility toward the police are part of that lifestyle, and there is a pretty firm code of ethics that you do not ever call the police, and you do not ever talk to the police. When someone snitches, people find out because they check court records and police reports. Whoever gets caught snitching will eventually pay the price.

So I’m surprised it was only 56%.
Paul Jarc says

July 26, 2013 at 11:38 am

Note that the game theory formalization doesn’t quite line up with human psychology for the usual description of the dilemma, since it assumes we don’t care about others’ outcomes, when in fact we do. So there is a different visualization that avoids that problem. LessWrong also had a recent Prisoner’s Dilemma tournament for programs that must make their decisions simultaneously, but each gets to examine the other’s source code before deciding.

Mano Singham

Just another Freethought Blogs site
