Abstract
We consider a class of discrete-time two-person zero-sum Markov games with Borel state and action spaces, and possibly unbounded payoffs. The game evolves according to the recursive equation $x_{n+1} = F(x_n, a_n, b_n, \xi_n)$, $n = 0, 1, \ldots$, where the disturbance process $\{\xi_n\}$ is formed by independent and identically distributed $\mathbb{R}^k$-valued random vectors, which are observable but whose common density $p$ is unknown to both players. Under certain continuity and compactness conditions, we combine a nonstationary iteration procedure and suitable density estimation methods to construct asymptotically discounted optimal strategies for both players.
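As a rough illustration of how a nonstationary iteration can be combined with density estimation, the recursion below sketches one plausible scheme; it is not taken from the paper. All symbols other than $F$ and $p$ are assumptions introduced here: $r$ is a one-stage payoff, $\alpha \in (0,1)$ a discount factor, $A(x)$ and $B(x)$ the admissible action sets, $\mathbb{P}(\cdot)$ the corresponding sets of randomized actions, and $\hat{p}_n$ an estimate of $p$ built from the disturbances observed up to stage $n$.

```latex
% Hypothetical sketch: value iteration in which the unknown density p
% is replaced at stage n by the current estimate \hat{p}_n.
V_0 \equiv 0, \qquad
V_n(x) = \sup_{\varphi \in \mathbb{P}(A(x))} \; \inf_{\psi \in \mathbb{P}(B(x))}
\int_{A(x)} \int_{B(x)}
\Big[ r(x,a,b)
  + \alpha \int_{\mathbb{R}^k} V_{n-1}\big(F(x,a,b,s)\big)\, \hat{p}_n(s)\, ds
\Big] \, \psi(db)\, \varphi(da)
```

Under this reading, each player would use at stage $n$ a randomized action attaining (or nearly attaining) the outer optimum in $V_n(x_n)$, so the strategies change as the estimate $\hat{p}_n$ is refined.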
Original language | English |
---|---|
Pages (from-to) | 1405-1421 |
Number of pages | 17 |
Journal | SIAM Journal on Control and Optimization |
Volume | 48 |
Issue number | 3 |
State | Published - 2009 |
Keywords
- Adaptive strategies
- Asymptotic optimality
- Discounted payoff
- Zero-sum Markov games