Markov control models with unknown random state–action-dependent discount factors

J. Adolfo Minjárez-Sosa*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

The paper deals with a class of discounted discrete-time Markov control models with non-constant discount factors of the form $$\tilde{\alpha}(x_{n},a_{n},\xi_{n+1})$$, where $$x_{n}$$, $$a_{n}$$, and $$\xi_{n+1}$$ are the state, the action, and a random disturbance at time $$n$$, respectively, taking values in Borel spaces. Assuming that the one-stage cost is possibly unbounded and that the distributions of $$\xi_{n}$$ are unknown, we study the corresponding optimal control problem under two settings. In the first, we assume that the random disturbance process $$\{\xi_{n}\}$$ is formed by observable independent and identically distributed random variables, and we introduce an estimation and control procedure to construct strategies. In the second, $$\{\xi_{n}\}$$ is assumed to be non-observable, with distributions that may change from stage to stage; in this case the problem is studied as a minimax control problem in which the controller has an opponent selecting the distribution of the corresponding random disturbance at each stage.
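As a rough illustration of the model described in the abstract, the following Python sketch accumulates a discounted cost along one trajectory when the discount factor depends on the state, the action, and the next disturbance. The abstract does not state the performance criterion, so the accumulated-product form used here (the cost at stage n is weighted by the product of all earlier discount factors) is an assumption, as are all function names. The second helper averages the discount factor over observed disturbances, which is one simple way an empirical estimate could stand in for the unknown distribution in the observable i.i.d. setting; it is not claimed to be the paper's estimation procedure.

```python
def discounted_cost(x0, policy, dynamics, cost, discount, disturbances):
    """Return sum_n Gamma_n * cost(x_n, a_n) along one trajectory.

    Assumed convention (not taken from the abstract):
    Gamma_0 = 1 and Gamma_{n+1} = Gamma_n * discount(x_n, a_n, xi_{n+1}).
    All argument names are hypothetical.
    """
    x, gamma, total = x0, 1.0, 0.0
    for xi in disturbances:
        a = policy(x)                # action chosen at the current state
        total += gamma * cost(x, a)  # cost weighted by accumulated discount
        gamma *= discount(x, a, xi)  # random, state-action-dependent factor
        x = dynamics(x, a, xi)       # move to the next state
    return total


def empirical_expected_discount(x, a, discount, observed):
    """Average discount(x, a, xi) over observed disturbances.

    A plain sample mean, used here only as an illustrative stand-in
    for the expectation under the unknown disturbance distribution.
    """
    return sum(discount(x, a, xi) for xi in observed) / len(observed)


# Toy usage: constant cost 1 and constant discount 0.5 over three stages.
toy_total = discounted_cost(
    x0=0.0,
    policy=lambda x: 0,
    dynamics=lambda x, a, xi: x,
    cost=lambda x, a: 1.0,
    discount=lambda x, a, xi: 0.5,
    disturbances=[0, 0, 0],
)
```

In the toy usage the weights are 1, 0.5, and 0.25, so the total reduces to an ordinary geometric sum; with a genuinely random discount function the weights vary from one trajectory to the next.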

Original language: English
Pages (from-to): 743-772
Number of pages: 30
Journal: TOP
Volume: 23
Issue number: 3
State: Published - 1 Oct 2015

Bibliographical note

Publisher Copyright:
© 2015, Sociedad de Estadística e Investigación Operativa.

Keywords

  • Discounted optimality
  • Estimation and control procedures
  • Minimax control systems
  • Non-constant discount factors
