TY - JOUR
T1 - Empirical approximation in Markov games under unbounded payoff
T2 - Discounted and average criteria
AU - Luque-Vásquez, Fernando
AU - Minjárez-Sosa, J. Adolfo
N1 - Funding Information:
This work was partially supported by Consejo Nacional de Ciencia y Tecnología (CONACYT) under grant CB2015/254306.
PY - 2017
Y1 - 2017
N2 - This work deals with a class of discrete-time zero-sum Markov games whose state process {x_t} evolves according to the equation x_{t+1} = F(x_t, a_t, b_t, ϵ_t), where a_t and b_t represent the actions of players 1 and 2, respectively, and {ϵ_t} is a sequence of independent and identically distributed random variables with unknown distribution θ. Assuming possibly unbounded payoff, and using the empirical distribution to estimate θ, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both discounted and average criteria.
AB - This work deals with a class of discrete-time zero-sum Markov games whose state process {x_t} evolves according to the equation x_{t+1} = F(x_t, a_t, b_t, ϵ_t), where a_t and b_t represent the actions of players 1 and 2, respectively, and {ϵ_t} is a sequence of independent and identically distributed random variables with unknown distribution θ. Assuming possibly unbounded payoff, and using the empirical distribution to estimate θ, we introduce approximation schemes for the value of the game as well as for optimal strategies, considering both discounted and average criteria.
KW - Discounted and average criteria
KW - Empirical estimation
KW - Markov games
UR - http://www.scopus.com/inward/record.url?scp=85033800462&partnerID=8YFLogxK
U2 - 10.14736/kyb-2017-4-0694
DO - 10.14736/kyb-2017-4-0694
M3 - Article
SN - 0023-5954
VL - 53
SP - 694
EP - 716
JO - Kybernetika
JF - Kybernetika
IS - 4
ER -