A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces

Óscar Vega-Amaya*, Joaquín López-Borbón

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

This paper gives computable performance bounds for the approximate value iteration (AVI) algorithm when the approximation operators used satisfy the following properties: (i) they are positive linear operators; (ii) constant functions are fixed points of such operators; (iii) they have a certain continuity property. Such operators define transition probabilities on the state space of the controlled system. This has two important consequences: (a) the approximating function can be seen as the average value of the target function with respect to the induced transition probability; (b) the approximation step in the AVI algorithm can be thought of as a perturbation of the original Markov model. These two facts yield finite-time bounds for the performance of the AVI algorithm that depend on how accurately the operators approximate the cost function and the transition law of the system. The results are illustrated with numerical approximations for a class of inventory systems.
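To make the setting concrete, the following is a minimal sketch of AVI with one operator that has properties (i) and (ii): piecewise-linear interpolation on a grid, whose weights are nonnegative and sum to one, so they can be read as a transition probability from a state to the grid nodes. It is not taken from the paper. The inventory dynamics, the cost parameters, the demand distribution, the grid, and every function name below are hypothetical choices made only for illustration.

```python
from bisect import bisect_right


def interp_weights(grid, x):
    """Piecewise-linear interpolation weights at x as (node index, weight) pairs.

    The weights are nonnegative and sum to 1, so they form a probability
    distribution over the grid nodes.
    """
    if x <= grid[0]:
        return [(0, 1.0)]
    if x >= grid[-1]:
        return [(len(grid) - 1, 1.0)]
    j = bisect_right(grid, x)  # grid[j - 1] <= x < grid[j]
    t = (x - grid[j - 1]) / (grid[j] - grid[j - 1])
    return [(j - 1, 1.0 - t), (j, t)]


def apply_operator(grid, values, x):
    """Approximating function at x: an average of the node values."""
    return sum(w * values[i] for i, w in interp_weights(grid, x))


def bellman_at(x, grid, values, actions, demand, alpha, capacity,
               order_cost=1.0, holding_cost=0.5, shortage_cost=4.0):
    """Discounted dynamic programming operator at stock level x.

    Hypothetical model: order a units, demand w arrives with probability p,
    unmet demand is lost, and the next stock level is max(x + a - w, 0).
    """
    best = float("inf")
    for a in actions:
        if x + a > capacity:
            continue  # the order would exceed the storage capacity
        total = 0.0
        for w, p in demand:
            leftover = x + a - w
            stage = (order_cost * a
                     + holding_cost * max(leftover, 0.0)
                     + shortage_cost * max(-leftover, 0.0))
            next_x = max(leftover, 0.0)
            total += p * (stage + alpha * apply_operator(grid, values, next_x))
        best = min(best, total)
    return best


def approximate_value_iteration(grid, actions, demand, alpha, capacity, iterations):
    """Iterate v <- L T v, storing the iterate through its values at the nodes."""
    values = [0.0] * len(grid)
    for _ in range(iterations):
        values = [bellman_at(x, grid, values, actions, demand, alpha, capacity)
                  for x in grid]
    return values


# Illustrative data (assumed, not from the paper).
grid = [0.0, 2.5, 5.0, 7.5, 10.0]
actions = [float(a) for a in range(11)]
demand = [(0.0, 0.2), (1.5, 0.5), (3.0, 0.3)]  # (demand size, probability)
values = approximate_value_iteration(grid, actions, demand,
                                     alpha=0.9, capacity=10.0, iterations=50)
```

Because the interpolant reproduces the node values exactly, storing the iterate at the nodes is enough to evaluate it anywhere in the state space. Other averaging schemes with nonnegative weights summing to one could be substituted for `interp_weights` without changing the rest of the sketch.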

Original language: English
Pages (from-to): 261-278
Number of pages: 18
Journal: Journal of Dynamics and Games
Volume: 3
Issue number: 3
DOI
Status: Published - 2016

Bibliographical note

Publisher Copyright:
© 2016, American Institute of Mathematical Sciences.
