
Reaching Pareto-optimality in prisoner’s dilemma using conditional joint action learning

Dipyaman Banerjee and Sandip Sen

(1)  Department of Computer Science, University of Tulsa, Tulsa, OK, USA

Published online: 30 April 2007

Abstract  We consider the learning problem faced by two self-interested agents repeatedly playing a general-sum stage game. We assume that the players can observe each other’s actions but not the payoffs received by the other player. The concept of Nash Equilibrium in repeated games provides an individually rational solution for playing such games and can be achieved by playing the Nash Equilibrium strategy for the single-shot game in every iteration. Such a strategy, however, can sometimes lead to a Pareto-Dominated outcome in games like the Prisoner’s Dilemma. We therefore prefer learning strategies that converge to a Pareto-Optimal outcome that also produces a Nash Equilibrium payoff for repeated two-player, n-action general-sum games. The Folk Theorem enables us to identify such outcomes. In this paper, we introduce the Conditional Joint Action Learner (CJAL), which learns the conditional probability of an action taken by the opponent given its own actions and uses it to decide its next course of action. We empirically show that, under self-play and if the payoff structure of the Prisoner’s Dilemma game satisfies certain conditions, a CJAL learner using a random exploration strategy followed by a completely greedy exploitation technique will learn to converge to a Pareto-Optimal solution. We also show that such learning will generate Pareto-Optimal payoffs in a large majority of other two-player general-sum games. We compare the performance of CJAL with that of existing algorithms such as WoLF-PHC and JAL on all structurally distinct two-player conflict games with ordinal payoffs.
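
The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the learner counts how often the opponent played each action after each of its own actions, estimates the conditional probability from those counts, and then picks the action with the highest expected payoff under that estimate. The class name CJALSketch, the payoff values (T=5, R=3, P=1, S=0), the uniform prior for unseen actions, and the fixed-length exploration phase are all hypothetical choices made for illustration.

```python
import random

# Illustrative Prisoner's Dilemma payoffs for one player: PAYOFF[own][opp].
# Actions: 0 = cooperate, 1 = defect. The values are an assumption, not from the paper.
PAYOFF = [[3, 0], [5, 1]]


class CJALSketch:
    """Hypothetical learner that conditions the opponent's action on its own action."""

    def __init__(self, payoff, explore_rounds, rng):
        self.payoff = payoff
        self.n = len(payoff)
        self.explore_rounds = explore_rounds  # length of the random exploration phase
        self.rng = rng
        self.rounds = 0
        # counts[a][b]: how often the opponent played b when this agent played a
        self.counts = [[0] * self.n for _ in range(self.n)]

    def cond_prob(self, own, opp):
        """Estimate Pr(opponent plays opp | this agent plays own)."""
        total = sum(self.counts[own])
        if total == 0:
            return 1.0 / self.n  # assumed uniform prior when own was never tried
        return self.counts[own][opp] / total

    def expected_utility(self, own):
        """Expected payoff of own under the conditional estimate."""
        return sum(self.cond_prob(own, b) * self.payoff[own][b]
                   for b in range(self.n))

    def choose(self):
        """Explore uniformly at random first, then act greedily."""
        if self.rounds < self.explore_rounds:
            return self.rng.randrange(self.n)
        return max(range(self.n), key=self.expected_utility)

    def observe(self, own, opp):
        """Record the joint action played in the round that just ended."""
        self.counts[own][opp] += 1
        self.rounds += 1


def self_play(rounds, explore_rounds, seed=0):
    """Let two sketch learners play the symmetric game against each other."""
    rng = random.Random(seed)
    a = CJALSketch(PAYOFF, explore_rounds, rng)
    b = CJALSketch(PAYOFF, explore_rounds, rng)
    history = []
    for _ in range(rounds):
        x, y = a.choose(), b.choose()
        a.observe(x, y)
        b.observe(y, x)
        history.append((x, y))
    return a, b, history
```

Calling self_play(1000, 200) returns both learners and the list of joint actions played. Whether the pair settles on mutual cooperation depends on the payoff values and on what was observed during exploration, which is consistent with the abstract's statement that convergence holds only when the payoff structure satisfies certain conditions.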

Keywords  Multiagent learning - Game theory - Prisoner’s dilemma


Dipyaman Banerjee (Corresponding author)
Email: dipyaman@gmail.com

Sandip Sen
Email: sandip@utulsa.edu