Abstract
The paper deals with a class of discounted discrete-time Markov control models with non-constant discount factors of the form $$\tilde{\alpha }(x_{n},a_{n},\xi _{n+1})$$, where $$x_{n}$$, $$a_{n}$$, and $$\xi _{n+1}$$ are the state, the action, and a random disturbance at time $$n$$, respectively, taking values in Borel spaces. Assuming that the one-stage cost is possibly unbounded and that the distributions of $$\xi _{n}$$ are unknown, we study the corresponding optimal control problem under two settings. In the first, the random disturbance process $$\left\{ \xi _{n}\right\} $$ is formed by observable independent and identically distributed random variables, and we introduce an estimation and control procedure to construct strategies. In the second, $$\left\{ \xi _{n}\right\} $$ is non-observable and its distributions may change from stage to stage; in this case the problem is studied as a minimax control problem in which the controller faces an opponent selecting the distribution of the random disturbance at each stage.
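For context, a discounted criterion with a non-constant discount factor of this kind typically takes the following form (a sketch only; the symbols $$V$$, $$\pi$$, and the one-stage cost $$c$$ are assumed here and may differ from the paper's own notation):

$$
V(\pi ,x):=E_{x}^{\pi }\!\left[ c(x_{0},a_{0})+\sum_{n=1}^{\infty }\left( \prod_{k=0}^{n-1}\tilde{\alpha }(x_{k},a_{k},\xi _{k+1})\right) c(x_{n},a_{n})\right],
$$

and the optimal control problem consists in minimizing $$V(\pi ,x)$$ over all admissible strategies $$\pi$$.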
| Original language | English |
| --- | --- |
| Pages (from-to) | 743-772 |
| Number of pages | 30 |
| Journal | TOP |
| Volume | 23 |
| Issue number | 3 |
| DOIs | |
| State | Published - 1 Oct 2015 |
Bibliographical note
Publisher Copyright: © 2015, Sociedad de Estadística e Investigación Operativa.
Keywords
- Discounted optimality
- Estimation and control procedures
- Minimax control systems
- Non-constant discount factors