## Abstract

The paper deals with a class of discounted discrete-time Markov control models with non-constant discount factors of the form $$\tilde{\alpha }(x_{n},a_{n},\xi _{n+1})$$, where $$x_{n}$$, $$a_{n}$$, and $$\xi _{n+1}$$ are the state, the action, and a random disturbance at time $$n$$, respectively, taking values in Borel spaces. Assuming that the one-stage cost is possibly unbounded and that the distributions of $$\xi _{n}$$ are unknown, we study the corresponding optimal control problem under two settings. In the first, we assume that the random disturbance process $$\left\{ \xi _{n}\right\} $$ is formed by observable independent and identically distributed random variables, and we introduce an estimation and control procedure to construct strategies. In the second, $$\left\{ \xi _{n}\right\} $$ is assumed to be non-observable, with distributions that may change from stage to stage; in this case the problem is studied as a minimax control problem in which the controller has an opponent selecting the distribution of the corresponding random disturbance at each stage.
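
The abstract does not state the performance index explicitly. As a hedged sketch, a discounted cost with discount factors of this form is commonly written as follows, where the one-stage cost $$c$$, the policy $$\pi $$, the initial state $$x$$, and the notation $$V(\pi ,x)$$ and $$E_{x}^{\pi }$$ are assumptions introduced here for illustration and are not taken from the source:

$$V(\pi ,x) = E_{x}^{\pi }\left[ \sum _{n=0}^{\infty } \left( \prod _{k=0}^{n-1} \tilde{\alpha }(x_{k},a_{k},\xi _{k+1}) \right) c(x_{n},a_{n}) \right]$$

Here the empty product (the case $$n=0$$) is taken to be $$1$$. If $$\tilde{\alpha }$$ is a constant $$\alpha $$, the product reduces to $$\alpha ^{n}$$ and the expression becomes the usual discounted criterion.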

Original language | English |
---|---|
Pages (from-to) | 743-772 |
Number of pages | 30 |
Journal | TOP |
Volume | 23 |
Issue number | 3 |
DOIs | |
State | Published - 1 Oct 2015 |

### Bibliographical note

Publisher Copyright: © 2015, Sociedad de Estadística e Investigación Operativa.

## Keywords

- Discounted optimality
- Estimation and control procedures
- Minimax control systems
- Non-constant discount factors