Abstract
We consider a class of discrete-time two-person zero-sum Markov games with Borel state and action spaces and possibly unbounded payoffs. The game evolves according to the recursive equation $x_{n+1} = F(x_n, a_n, b_n, \xi_n)$, $n = 0, 1, \ldots$, where the disturbance process $\{\xi_n\}$ consists of independent and identically distributed $\mathbb{R}^k$-valued random vectors that are observable, but whose common density $\rho^*$ is unknown to both players. Combining suitable methods of statistical estimation of $\rho^*$ with optimization procedures, we construct a pair of average optimal strategies.
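As a rough illustration of the setting described in the abstract, the sketch below simulates a scalar instance of the recursion and estimates the disturbance density from the observed disturbances. Everything concrete here is an assumption made for illustration: the dynamics `F`, the placeholder feedback rules standing in for the two players, the standard normal disturbance, and the Gaussian kernel estimator are not taken from the paper, and the optimization step that yields average optimal strategies is not shown.

```python
import math
import random


def F(x, a, b, xi):
    # Hypothetical scalar dynamics, chosen only for illustration.
    return 0.5 * x + a - b + xi


def kde(samples, bandwidth):
    """Return a Gaussian kernel density estimate built from observed disturbances."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def rho_hat(s):
        return norm * sum(
            math.exp(-0.5 * ((s - xi) / bandwidth) ** 2) for xi in samples
        )

    return rho_hat


def simulate(n_steps, seed=0):
    """Run x_{n+1} = F(x_n, a_n, b_n, xi_n) and record the observed disturbances."""
    rng = random.Random(seed)
    x = 0.0
    observed = []
    for _ in range(n_steps):
        a = -0.1 * x  # placeholder rule for player 1 (assumption)
        b = 0.1 * x   # placeholder rule for player 2 (assumption)
        # The players do not know this density; a standard normal is assumed here.
        xi = rng.gauss(0.0, 1.0)
        observed.append(xi)
        x = F(x, a, b, xi)
    return x, observed


x_final, xis = simulate(2000)
rho_hat = kde(xis, bandwidth=0.3)  # estimate of the unknown density
```

In an adaptive scheme of the kind the abstract outlines, an estimate such as `rho_hat` would be refreshed as more disturbances are observed and fed into the optimization step; the bandwidth of 0.3 is an arbitrary choice for this sketch.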
| Original language | English |
|---|---|
| Pages (from-to) | 44-56 |
| Number of pages | 13 |
| Journal | Journal of Mathematical Analysis and Applications |
| Volume | 402 |
| Issue | 1 |
| DOI | |
| Status | Published - 1 Jun 2013 |
Title: Optimal strategies for adaptive zero-sum average Markov games