Prisoner's Dilemma

Life is a Prisoner’s Dilemma

Game Theory


Two criminal partners are arrested and put in prison. They are held in separate cells, so they cannot communicate with each other. The sentence for their crime is 10 years. The problem is that the police don’t have enough evidence for a conviction. They do, however, have enough evidence for a smaller crime with a 2 year sentence. The police give each prisoner the option to confess to the crime.

If one of them confesses, betraying their partner, while the other stays silent, the one who confesses goes free on all charges but their partner serves the full 10 year sentence. If instead they both confess, they split the sentence and each gets 5 years in prison. Lastly, if they both choose to stay silent, they are both charged with the smaller crime and each serves 2 years. Do you cooperate with your partner and stay silent, or do you defect?

This is the prisoner’s dilemma. It is one of the most popular game theory problems ever studied.

Prisoner’s Dilemma Game

In this scenario both players are rational and fully understand the rules of the game. Based on the given information, each prisoner’s best choice is to confess. This is because no matter what the other person chooses to do, your best outcome occurs when you defect.

If your partner chooses to stay silent, you can either confess for 0 years or stay silent for a 2 year sentence. So you defect. If your partner chooses to confess, you can either confess for 5 years or stay silent for 10 years. So you defect. Regardless of what your partner chooses, your best choice is to defect, even though that means both prisoners receive a 5 year sentence. If they had both stayed silent, they would each have received only a 2 year sentence.
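The case-by-case reasoning above can be checked with a few lines of Python. This is only a sketch: the sentences come from the story, while the names `YEARS` and `best_response` are made up for illustration.

```python
# Years in prison for (my_choice, partner_choice); lower is better.
# The numbers come from the story above; the names are illustrative only.
SILENT, CONFESS = "silent", "confess"

YEARS = {
    (SILENT, SILENT): 2,
    (SILENT, CONFESS): 10,
    (CONFESS, SILENT): 0,
    (CONFESS, CONFESS): 5,
}

def best_response(partner_choice):
    """Return the choice that minimizes my own sentence for a given partner choice."""
    return min((SILENT, CONFESS), key=lambda mine: YEARS[(mine, partner_choice)])

for partner in (SILENT, CONFESS):
    print(f"If my partner chooses {partner}, my best response is {best_response(partner)}")
```

Both lines of output name "confess", which is what it means for confessing to be the better reply no matter what the partner does.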

Nash Equilibrium

Mutual defection like this is called a Nash equilibrium. The Nash equilibrium is a common solution concept for non-cooperative games involving two or more players. It occurs when no player has anything to gain by changing only their own strategy. In this case, neither prisoner is better off switching from confessing to staying silent while the other keeps confessing.

It is not, however, a strong Nash equilibrium. A strong Nash equilibrium is one where no group of players can deviate together in a way that benefits all of them. In this case, if both prisoners deviate together to staying silent, they reduce their sentences to only 2 years each.

Therefore our solution to this dilemma is mutual defection, because defecting is the strictly dominant strategy.
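To make the equilibrium idea concrete, here is a small sketch that tests every pair of choices and keeps the ones where neither prisoner can shorten their own sentence by switching alone. The helper name `is_nash` is an illustrative choice, and the table is the one from the story.

```python
from itertools import product

SILENT, CONFESS = "silent", "confess"
CHOICES = (SILENT, CONFESS)

# Years in prison for (my_choice, partner_choice); lower is better.
YEARS = {
    (SILENT, SILENT): 2,
    (SILENT, CONFESS): 10,
    (CONFESS, SILENT): 0,
    (CONFESS, CONFESS): 5,
}

def is_nash(a, b):
    """True if neither player can reduce their own sentence by switching alone."""
    a_ok = all(YEARS[(a, b)] <= YEARS[(alt, b)] for alt in CHOICES)
    b_ok = all(YEARS[(b, a)] <= YEARS[(alt, a)] for alt in CHOICES)
    return a_ok and b_ok

equilibria = [pair for pair in product(CHOICES, repeat=2) if is_nash(*pair)]
print(equilibria)

# Would both players be better off if they switched to silence together?
joint_switch_helps = YEARS[(SILENT, SILENT)] < YEARS[(CONFESS, CONFESS)]
print(joint_switch_helps)
```

The only pair that survives the check is mutual confession, and the last line shows that a joint switch to silence would still leave both prisoners better off.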

Iterated Prisoner’s Dilemma

This game is just one iteration. But what happens when we play this game multiple times in a row? This is called the iterated prisoner’s dilemma. It is the same game with the same rules, but the games are played in succession with memory of past games. This means that each player remembers the other player’s previous choices.

If the players know how many rounds will be played, the rational strategy is still to defect in every round. This is due to backward induction. In the last round the best strategy is to defect, because in a one round game that is the strictly dominant choice. Since both players know the last round will end in mutual defection, cooperating in the second to last round cannot be rewarded later, so they defect there too. This reasoning repeats all the way back to the first round of the game, where both players choose to defect.

Alternative Strategies

Things change when the players don’t know when the final iteration of the game is going to occur. When players are unaware of the number of rounds in the game, cooperation can emerge. Players are able to sustain a cooperative outcome because over many rounds they receive greater utility than if they were to defect. But it only works because the players don’t know whether they will have to play another round with this person.

This game with an unknown number of rounds has led to the creation of different strategies, such as tit-for-tat and the grim strategy. All of these strategies are designed to try to optimize your total utility at the end of the game.
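One way to see why uncertainty about the last round matters is to compare expected total prison time. The sketch below assumes that after every round there is a fixed chance `p` of another round, and that a partner who is betrayed once confesses in every later round. Both assumptions, and the function names, are mine and not part of the original story.

```python
def expected_years_if_loyal(p, both_silent=2):
    """Expected total years if both players stay silent in every round.

    p is the chance that another round follows each round (0 <= p < 1),
    so the expected number of rounds is 1 / (1 - p).
    """
    return both_silent / (1 - p)

def expected_years_if_betray(p, go_free=0, both_confess=5):
    """Expected total years if I confess now against a silent partner
    and both of us confess in every later round."""
    return go_free + p * both_confess / (1 - p)

for p in (0.3, 0.4, 0.5, 0.9):
    loyal = expected_years_if_loyal(p)
    betray = expected_years_if_betray(p)
    print(f"p={p}: loyal={loyal:.2f} years, betray={betray:.2f} years")
```

With these sentences, staying silent gives the lower expected total whenever `p` is above 0.4, because 2 / (1 - p) is less than 5p / (1 - p) exactly when 2 is less than 5p.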
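Here is a minimal sketch of how such strategies can be played against each other. The strategy functions only see the history of moves and never the number of rounds, which stands in for not knowing when the game ends. The function names and the choice of 10 rounds are assumptions for illustration.

```python
SILENT, CONFESS = "silent", "confess"

# Years in prison for (my_choice, partner_choice); lower is better.
YEARS = {
    (SILENT, SILENT): 2,
    (SILENT, CONFESS): 10,
    (CONFESS, SILENT): 0,
    (CONFESS, CONFESS): 5,
}

def tit_for_tat(my_history, their_history):
    """Stay silent first, then copy the partner's previous move."""
    return their_history[-1] if their_history else SILENT

def grim(my_history, their_history):
    """Stay silent until the partner confesses once, then confess forever."""
    return CONFESS if CONFESS in their_history else SILENT

def always_confess(my_history, their_history):
    """Defect in every round."""
    return CONFESS

def play(strategy_a, strategy_b, rounds=10):
    """Return the total years served by each player over the given rounds."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        total_a += YEARS[(move_a, move_b)]
        total_b += YEARS[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b

print(play(tit_for_tat, grim))
print(play(tit_for_tat, always_confess))
print(play(always_confess, always_confess))
```

Two cooperative strategies serve 20 years each over 10 rounds, while two constant defectors serve 50 each. Tit-for-tat loses only the first round to a constant defector and matches them afterward.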

Life Strategies

Many people are unaware that we are all playing multiple iterations of the prisoner’s dilemma, of unknown length, all the time. You play this game in sports, in biology, and in everyday life. It is the same idea as the saying that there are three types of people: wolves, sheep, and sheepdogs. Wolves are the ones who always defect as soon as they get the chance. Sheep are the ones who always cooperate. And sheepdogs are the people who follow strategies like tit-for-tat. They will cooperate until you defect, and then they will defect too.

Sports

In sports, using performance enhancing drugs (PEDs) has become quite common, to the point where if you want to compete it is almost required. This is a Nash equilibrium. Both players defect and choose to use PEDs. If one stays clean while the other uses PEDs, the PED user has a significant advantage. If both stay clean, then both are on an equal playing field and avoid the dangerous side effects of using.

In the long run both parties are better off if they stay clean. But they still choose to defect. The repeated nature of the game is not what increases how often players defect. A larger reason is that these players aren’t just playing against one other person but against tens, hundreds, or thousands of players, depending on how many there are in the sport. You need all of them to cooperate to get the utility of cooperation. So the more people who play the prisoner’s dilemma game, the less likely cooperation will be the outcome.

This is where organizations like the United States Anti-Doping Agency (USADA) come in. Basically, this organization was designed to increase the cost of defecting and in turn increase cooperation. To what degree is it effective? That’s a different story. Drug testing and drug use are very complex, and there are numerous loopholes to get around testing positive. But nonetheless it does increase the cost of defecting, even if only by a very slim margin.
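A rough way to see why more players makes cooperation less likely: suppose, purely as a simplifying assumption, that each athlete independently stays clean with the same probability `q`. Then the chance that everyone stays clean shrinks quickly as the field grows. The function name and the value 0.9 are illustrative, not data from any sport.

```python
def chance_everyone_cooperates(q, n_players):
    """Probability that all players cooperate, assuming each one independently
    cooperates with probability q (a simplifying assumption)."""
    return q ** n_players

for n in (2, 10, 100):
    print(f"{n} players: {chance_everyone_cooperates(0.9, n):.6f}")
```

Real athletes do not decide independently, so this is only a sketch of the trend, not a model of any actual sport.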
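The effect of raising the cost of defecting can be sketched by reusing the numbers from the prisoner story as stand-ins, with "confess" playing the role of using PEDs and "silent" the role of staying clean. The `penalty` parameter is a hypothetical extra cost for defecting, in the same units as the sentences. None of these numbers describe real testing programs.

```python
SILENT, CONFESS = "silent", "confess"

# Cost for (my_choice, other_choice), borrowed from the prisoner story.
YEARS = {
    (SILENT, SILENT): 2,
    (SILENT, CONFESS): 10,
    (CONFESS, SILENT): 0,
    (CONFESS, CONFESS): 5,
}

def cost_with_penalty(mine, theirs, penalty):
    """My cost when defecting carries a hypothetical extra penalty."""
    return YEARS[(mine, theirs)] + (penalty if mine == CONFESS else 0)

def defecting_still_pays(theirs, penalty):
    """True if defecting is still strictly cheaper for me than cooperating."""
    return cost_with_penalty(CONFESS, theirs, penalty) < cost_with_penalty(SILENT, theirs, penalty)

for penalty in (1, 3, 6):
    vs_clean = defecting_still_pays(SILENT, penalty)
    vs_user = defecting_still_pays(CONFESS, penalty)
    print(f"penalty={penalty}: pays vs cooperator={vs_clean}, pays vs defector={vs_user}")
```

With these stand-in numbers, a small penalty changes nothing, a medium one makes cooperating the better reply to a cooperator, and only a large one makes defecting a bad idea in every case.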

Read about why some people think random drug testing is not random at all.

Darwinism

Darwinism: survival of the fittest. We all learned this in school. Organisms best adjusted to their environment are the most successful in surviving and reproducing. This follows the same principle as the prisoner’s dilemma.

Here is just a surface level analysis of Darwin’s study. The defect strategy belongs to the animals that find and consume the most food and reproduce easily while not sharing their wealth. Wealth here means being “best adjusted to their environment”, not what’s in their 401(k). They don’t see a benefit in cooperating because if the other animal they are playing against doesn’t cooperate, they lose the game. In this case, that means perishing from lack of food or failing to reproduce. Therefore the Nash equilibrium is for the animals to defect, and thus we get survival of the fittest.

He Was Wrong?

Many believed that Darwin was wrong to say that all beings fight for resources and fight to reproduce. They thought this because trees, for example, share resources and help each other survive. This is one of the reasons forests are dense and full of life. The trees aren’t competing with one another but sharing instead. Some claimed that this went against Darwin’s theory and findings. But it doesn’t. It is actually just the other outcome of the prisoner’s dilemma game that we all play.

Darwin said that the organisms that best adjust to their environment are the most successful in surviving and reproducing. That has nothing to do with fighting for resources as such. The real game is this: how do I give myself the best chance of survival? I could defect and fight other organisms for resources. Or I could cooperate and share any excess resources with other organisms, hoping that one day when I need resources they will share their excess with me.

Both of these outcomes occur in real life. This suggests that the prisoner’s dilemma isn’t just a game but is biologically programmed into our DNA.

Survivor

Now, I think the best place to see how the prisoner’s dilemma is present every day in our lives is the reality TV game show Survivor.

In this game, players are put on an island and placed in different tribes. They compete in challenges for resources and immunity in the game. The tribe that performs the worst in a challenge must vote someone from their tribe out of the game. The tribes eventually get merged into one, and they repeat this challenge and vote out process until there are 3 players left. Then the players who have been voted out of the game vote for who they want to win out of the remaining 3 players. The winner is crowned Sole Survivor and wins a bunch of money.

If that didn’t make any sense, don’t worry it is a lot more complex than that. Just watch the show, it is really good.

You can read more of my analysis on what factors affect who wins the game of Survivor here.

Anyway, throughout the game players create alliances. An alliance is made up of the people you work with to vote the people who are not in your alliance out of the game. When the people in the alliance work together, this is the equivalent of both players cooperating in the prisoner’s dilemma. But sometimes a player flips and votes against their own alliance. This is equivalent to defecting.

Since there can only be one Sole Survivor at the end of the game, all alliances eventually break apart due to players defecting. This is expected, because the best outcome for a player is to cooperate and then defect right before the other player defects. So what ends up happening is that players cooperate at the beginning of the season, and as the season progresses more players defect against their alliance. The best strategy seems to be for players to convince their alliance that they will cooperate and then actually defect, so their alliance doesn’t see it coming.

Real Life

Survivor is just an exaggerated version of real life. The prisoner’s dilemma that the players so clearly play in the game is exactly what we play in all of our relationships. It may not seem obvious, though, because the vast majority of us have always cooperated and will always do so.

Family and Friends

For example, most people will always cooperate with their family. They know that this relationship is important and will last a lifetime. So the small extra benefit of defecting against your family is not worth it in the long run (defecting in this scenario would be something like borrowing a lot of money and never paying it back). Some people do defect against their family, but it is most common to cooperate with them for a lifetime.

People tend to defect a little more often with friends. This is because you can always make more friends, while you can’t get a new family. We have all had that friend who screwed us over one way or another. You were both playing a game, and you chose to cooperate while they chose to defect. After that you learn not to trust them, or you stop being friends with them. This is like refusing to play another game of prisoner’s dilemma because you know they will just defect. This is what makes long term friendships so valuable: you both had countless chances to defect, but you both chose to cooperate for the benefit of each other.

Business

The most common place to see people defect is in business. With the lack of emotion and empathy in conducting business, coupled with numerous business opportunities, it is easy to get away with defecting. And some are known for their effective defecting abilities. Trump comes to mind here.

“Fool me once, shame on you; fool me twice, shame on me.” Although this saying doesn’t directly relate to the prisoner’s dilemma, it does resemble a multiple iteration game. If you defect while I cooperate, shame on you for not being a good person and cooperating. If in the next game I cooperate and you defect again, shame on me for not learning the lesson that you don’t cooperate.

Everything is a Game

Almost everything in life is some variation of the prisoner’s dilemma game. It is no longer just a made up scenario studied in higher education game theory classes. It is happening nearly every day of your life. Just take a step back and look at your relationships, and you’ll start to notice the game.
