Empirical approximation in Markov games under unbounded payoff: Discounted and average criteria

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

This work deals with a class of discrete-time zero-sum Markov games whose state process $\{x_t\}$ evolves according to the equation $x_{t+1} = F(x_t, a_t, b_t, \epsilon_t)$, where $a_t$ and $b_t$ represent the actions of players 1 and 2, respectively, and $\{\epsilon_t\}$ is a sequence of independent and identically distributed random variables with unknown distribution $\theta$. Assuming a possibly unbounded payoff, and using the empirical distribution to estimate $\theta$, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both the discounted and the average criteria.
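As a rough illustration of the estimation idea mentioned in the abstract, the sketch below replaces the unknown noise distribution with the empirical distribution of observed noise samples, so that a one-step expectation becomes a plain sample average. The function name `empirical_expectation`, the toy linear dynamics, and the quadratic function `v` are hypothetical choices made for this example; they are not taken from the paper, and the sketch does not implement the paper's approximation schemes.

```python
import random
from typing import Callable, Sequence


def empirical_expectation(
    v: Callable[[float], float],
    F: Callable[[float, float, float, float], float],
    x: float,
    a: float,
    b: float,
    samples: Sequence[float],
) -> float:
    """Approximate E[v(F(x, a, b, noise))] under the unknown noise law.

    The empirical distribution puts mass 1/n on each observed sample,
    so the integral against it is the sample mean of v(F(x, a, b, e)).
    """
    if not samples:
        raise ValueError("at least one noise sample is required")
    return sum(v(F(x, a, b, e)) for e in samples) / len(samples)


def toy_dynamics(x: float, a: float, b: float, e: float) -> float:
    # Hypothetical dynamics, chosen only for illustration.
    return x + a - b + e


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed noise model for the demo: standard normal draws.
    noise = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
    # v(y) = y**2 is unbounded, echoing the unbounded-payoff setting.
    estimate = empirical_expectation(lambda y: y * y, toy_dynamics, 0.0, 1.0, 0.5, noise)
    print(estimate)
```

With more noise samples the average tracks the true expectation more closely, which is the basic mechanism an empirical-distribution scheme relies on.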

Original language: English
Pages (from-to): 694-716
Number of pages: 23
Journal: Kybernetika
Volume: 53
Issue number: 4
State: Published - 2017

Bibliographical note

Funding Information:
This work was partially supported by Consejo Nacional de Ciencia y Tecnología (CONACYT) under grant CB2015/254306.

Keywords

  • Discounted and average criteria
  • Empirical estimation
  • Markov games
