Abstract
This paper studies finite nonzero-sum Markov games over an infinite horizon under a discounted optimality criterion. The state process evolves according to a stochastic difference equation driven by the players' actions and a random disturbance whose distribution is unknown to the players. The players observe the states, the actions, and the realized disturbances; they then use the empirical distribution of the disturbances to estimate the true distribution and choose their actions on the basis of the available information. In this setting, we propose a procedure, convergent almost surely and possibly after passing to a subsequence, to approximate Nash equilibria of the Markov game with the true distribution of the random disturbance.
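The estimation step described above can be illustrated with a minimal sketch: players record the realized disturbances and use their empirical distribution as a surrogate for the unknown law. The toy dynamics `F`, the disturbance support, and the uniform placeholder action choices below are hypothetical illustrations, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical state dynamics: x_{t+1} = F(x_t, a_t, b_t, xi_t), where xi_t is
# the random disturbance whose distribution is unknown to the players.
STATES = np.arange(5)

def F(x, a, b, xi):
    return int((x + a - b + xi) % len(STATES))

def true_disturbance():
    # Unknown to the players; they only observe its realizations.
    return rng.choice([0, 1, 2], p=[0.5, 0.3, 0.2])

# Players observe states, actions, and disturbances along the trajectory.
observed = []
x = 0
for t in range(2000):
    a, b = rng.integers(0, 2), rng.integers(0, 2)  # placeholder action choices
    xi = true_disturbance()
    observed.append(xi)
    x = F(x, a, b, xi)

# Empirical distribution of the observed disturbances, used as the estimate
# of the true law when solving the approximating games.
values, counts = np.unique(observed, return_counts=True)
empirical = dict(zip(values.tolist(), (counts / counts.sum()).tolist()))
print(empirical)  # approaches the true law {0: 0.5, 1: 0.3, 2: 0.2} a.s.
```

By the strong law of large numbers the empirical distribution converges almost surely to the true disturbance law, which is the basic fact behind estimation-based approximation schemes of this kind.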
| Original language | English |
|---|---|
| Pages (from-to) | 722-734 |
| Number of pages | 13 |
| Journal | Asian Journal of Control |
| Volume | 25 |
| Issue number | 2 |
| DOIs | |
| State | Published - Mar 2023 |
Bibliographical note
Publisher Copyright: © 2022 Chinese Automatic Control Society and John Wiley & Sons Australia, Ltd.
Keywords
- Markov games
- Nash equilibrium
- discounted criterion
- empirical estimation