The Prisoner's Dilemma

 

The prisoner's dilemma has been a rich source of research material since the 1950s. However, it was the publication of Axelrod's book [AXE84] in the 1980s that was largely responsible for bringing this research to the attention of areas outside game theory, including evolutionary computing, evolutionary biology, networked computer systems and the promotion of cooperation between opposing countries. Although the prisoner's dilemma, in the context of game theory, has been an active research area for almost 60 years [SCO63, SCO62, SCO60a, SCO60b, SCO59a, SCO59b] (it can be traced back to von Neumann and Morgenstern [VON44] and, of course, John Nash [NAS53, NAS50]), it remains an active research area, with a large number of scientific articles published every year.

 

The prisoner's dilemma has a modern-day incarnation in the form of the TV show "Shafted", a game show recently screened on terrestrial TV in the UK (note that this show is not a true prisoner's dilemma as defined by Rapoport [RAP96], but it does demonstrate that the ideas have wider applicability). At the end of the show, two contestants have accumulated a sum of money and must each decide whether to share the money or to try to take all of it for themselves. Each decision is made without knowing what the other contestant has decided to do. If both contestants cooperate, they share the money. If both defect, neither receives anything. If one cooperates and the other defects, the defector gets all the money and the cooperator gets nothing.

In the prisoner's dilemma (PD) you have to decide whether to cooperate with an opponent or to defect. Both you and your opponent make a choice, and then the decisions are revealed. You each receive a payoff according to the following matrix, where in each cell the first value is the payoff to the opponent (the column player) and the second is the payoff to you.

 

 

                        The opponent
                   Cooperate        Defect

 You  Cooperate    R=3 / R=3        T=5 / S=0

      Defect       S=0 / T=5        P=1 / P=1
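The matrix is small enough to capture directly in code. The following is a minimal sketch in Python (the dictionary and function names are illustrative, and the payoff values T=5, R=3, P=1, S=0 are taken from the table above) that returns the pair of payoffs for a single round:

```python
# Payoffs from the matrix above: T (temptation), R (reward),
# P (punishment), S (sucker). Choices are "C" (cooperate) or "D" (defect).
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation: R, R
    ("C", "D"): (0, 5),  # you cooperate, opponent defects: S for you, T for them
    ("D", "C"): (5, 0),  # you defect, opponent cooperates: T for you, S for them
    ("D", "D"): (1, 1),  # mutual defection: P, P
}

def play_round(your_choice, opponent_choice):
    """Return (your payoff, opponent's payoff) for one round."""
    return PAYOFFS[(your_choice, opponent_choice)]

print(play_round("C", "D"))  # (0, 5): the sucker and the temptation payoffs
```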

The question arises: what should you do in such a game?


Suppose you think the other player will cooperate. If you cooperate, you will receive a payoff of 3 for mutual cooperation. If you defect, you will receive the Temptation to Defect payoff of 5. Therefore, if you think the other player will cooperate, you should defect, giving you a payoff of 5.


But what if you think the other player will defect? If you cooperate, you get the Sucker payoff of zero. If you defect, you both receive the Punishment for Mutual Defection of 1 point. Therefore, if you think the other player will defect, you should defect as well.


So you should defect, no matter what your opponent chooses. Of course, the same logic holds for your opponent. And if you both defect, you each receive a payoff of 1, whereas the better outcome would have been mutual cooperation with a payoff of 3 each. The dilemma each prisoner faces is that mutual defection seems inevitable, yet its payoff is less than what could have been achieved by two cooperating players.
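This case analysis can be checked mechanically. The short sketch below (again illustrative Python, using only your own payoffs from the matrix) computes your best response to each possible opponent choice and shows that defection is the answer in both cases:

```python
# Your payoff for each (your choice, opponent's choice) pair,
# taken from the matrix above: T=5, R=3, P=1, S=0.
MY_PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

# For each possible opponent choice, find your best response.
for opponent in ("C", "D"):
    best = max(("C", "D"), key=lambda me: MY_PAYOFF[(me, opponent)])
    print(f"If the opponent plays {opponent}, your best response is {best}")
# Both lines print "D": defection strictly dominates cooperation.
```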

 

Mutual defection is a Nash equilibrium of PD. Informally, a set of strategies is a Nash equilibrium if no player can do better by unilaterally changing his or her strategy. Mutual defection is the unique Nash equilibrium of PD, which means it is the only stable solution to this game (see the sketch after the list below). In real-world scenarios, however, a Nash equilibrium is not necessarily played. Some conditions that guarantee the Nash equilibrium is played are:

  1. The players aim to maximize their own payoffs.
  2. The players know the Nash equilibrium strategy of all players.
  3. The players believe that a deviation in their own strategy will not cause deviations by any other players.
  4. There is common knowledge that all players know these conditions.
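As a sanity check on the claim that mutual defection is the unique Nash equilibrium, the sketch below (illustrative Python, using the symmetric payoffs from the matrix above) enumerates all four pure-strategy profiles and keeps only those in which neither player can improve by a unilateral change:

```python
from itertools import product

# Row player's payoff for (own choice, other's choice); the game is symmetric.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def is_nash(a, b):
    """True if neither player can improve by unilaterally switching."""
    a_ok = all(PAYOFF[(a, b)] >= PAYOFF[(alt, b)] for alt in ("C", "D"))
    b_ok = all(PAYOFF[(b, a)] >= PAYOFF[(alt, a)] for alt in ("C", "D"))
    return a_ok and b_ok

equilibria = [(a, b) for a, b in product("CD", repeat=2) if is_nash(a, b)]
print(equilibria)  # [('D', 'D')] -- mutual defection is the only equilibrium
```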