Abstract
This paper studies a class of partially observable discounted Markov decision processes on Borel state and action spaces with unbounded one-stage cost. The discount rate is a stochastic process that evolves according to a difference equation and is itself assumed to be partially observable. By introducing a suitable control model and filtering processes, we prove the existence of optimal control policies. We also illustrate the results with a class of GI/GI/1 queueing systems, for which we obtain the corresponding optimality equation and filtering process explicitly.
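As a rough illustration of what a discount driven by a difference equation can look like, the following minimal sketch accumulates a discounted cost along one trajectory. Everything in it is an assumption made for illustration and is not taken from the paper: the multiplicative form of the discounting, the clipped additive update in `next_discount`, and the names `simulate_discounts` and `discounted_cost` are all hypothetical, and partial observability and filtering are not modelled.

```python
from typing import Callable, List, Sequence


def next_discount(alpha: float, eta: float) -> float:
    """Hypothetical difference equation alpha_{t+1} = G(alpha_t, eta_t).

    Assumption: an additive disturbance eta, clipped so the factor stays in [0, 1].
    """
    return min(1.0, max(0.0, alpha + eta))


def simulate_discounts(
    alpha0: float,
    noises: Sequence[float],
    step: Callable[[float, float], float] = next_discount,
) -> List[float]:
    """Iterate the difference equation from alpha0 along a disturbance sequence."""
    alphas = [alpha0]
    for eta in noises:
        alphas.append(step(alphas[-1], eta))
    return alphas


def discounted_cost(costs: Sequence[float], alphas: Sequence[float]) -> float:
    """Sum of costs[t] weighted by the product of alphas[0..t-1].

    Assumption: the stage-t cost is discounted by the product of all earlier
    discount factors, so the stage-0 cost carries weight 1.
    """
    total = 0.0
    weight = 1.0
    for cost, alpha in zip(costs, alphas):
        total += weight * cost
        weight *= alpha
    return total
```

For instance, `discounted_cost(costs, simulate_discounts(0.9, noises))` evaluates one sample path; averaging over many simulated paths would approximate an expected discounted cost under this assumed model.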
Original language | English |
---|---|
Pages (from-to) | 960-983 |
Number of pages | 24 |
Journal | Kybernetika |
Volume | 58 |
Issue number | 6 |
State | Published - 2022 |
Bibliographical note
Publisher Copyright: © 2022 Institute of Information Theory and Automation of The Czech Academy of Sciences. All rights reserved.
Keywords
- discounted criterion
- optimal policies
- partially observable systems
- queueing models
- random discount factors