TY - JOUR
T1 - Some advances on constrained Markov decision processes in Borel spaces with random state-dependent discount factors
AU - Jasso-Fuentes, Héctor
AU - López-Martínez, Raquiel R.
AU - Minjárez-Sosa, J. Adolfo
N1 - Publisher Copyright: © 2022 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2024
Y1 - 2024
AB - This paper addresses a class of discrete-time Markov decision processes in Borel spaces with a finite number of cost constraints. The constrained control model considers costs of discounted type with state-dependent discount factors which are subject to external disturbances. Our objective is to prove the existence of optimal control policies and characterize them according to certain optimality criteria. Specifically, by appropriately rewriting our original constrained problem as a new one on a space of occupation measures, we apply the direct method to show solvability. Next, the problem is defined as a convex program, and we prove that the existence of a saddle point of the associated Lagrangian operator is equivalent to the existence of an optimal control policy for the constrained problem. Finally, we turn our attention to multi-objective optimization problems, where the existence of Pareto optimal policies can be obtained from the existence of saddle points of the aforementioned Lagrangian or, equivalently, from the existence of optimal control policies of constrained problems.
KW - 90C40
KW - 93E20
KW - Markov decision processes
KW - Pareto optimality
KW - constrained control problems
KW - convex programming
KW - random non-constant discount factor
UR - http://www.scopus.com/inward/record.url?scp=85139747145&partnerID=8YFLogxK
U2 - 10.1080/02331934.2022.2130699
DO - 10.1080/02331934.2022.2130699
M3 - Article
AN - SCOPUS:85139747145
SN - 0233-1934
VL - 73
SP - 925
EP - 951
JO - Optimization
JF - Optimization
IS - 4
ER -