Welcome!
To use the personalized features of this site, please log in or register.
If you have forgotten your username or password, we can help.
|
 |
Average Reward Timed Games
| |
|
Times Games and Synthesis
Average Reward Timed Games
Bo Thomas Adler1, Luca de Alfaro1 and Marco Faella1, 2
| (1) |
School of Engineering, University of California, Santa Cruz, USA |
| (2) |
Dipartimento di Scienze Fisiche, Università di Napoli “Federico II”, Italy |
Abstract
We consider real-time games where the goal consists, for each player, in maximizing the average reward he or she receives
per time unit. We consider zero-sum rewards, so that a reward of + r to one player corresponds to a reward of – r to the other player. The games are played on discrete-time game structures which can be specified using a two-player version
of timed automata whose locations are labeled by reward rates. Even though the rewards themselves are zero-sum, the games
are not, due to the requirement that time must progress along a play of the game.
Since we focus on control applications, we define the value of the game to a player to be the maximal average reward per
time unit that the player can ensure. We show that, in general, the values to players 1 and 2 do not sum to zero. We provide
algorithms for computing the value of the game for either player; the algorithms are based on the relationship between the
original, infinite-round game, and a derived game that is played for only finitely many rounds. As memoryless optimal strategies
exist for both players in both games, we show that the problem of computing the value of the game is in NP∩coNP.
This research was supported in part by the NSF CAREER award CCR-0132780, by the ONR grant N00014-02-1-0671, and by the ARP
award TO.030.MM.D.
Fulltext Preview (Small, Large)
|
|
|
|
|
|