TY - JOUR
T1 - Adaptive control for discrete-time Markov processes with unbounded costs
T2 - Average criterion
AU - Gordienko, Evgueni I.
AU - Minjárez-Sosa, J. Adolfo
PY - 1998
Y1 - 1998
N2 - The paper deals with a class of discrete-time Markov control processes with Borel state and action spaces, and possibly unbounded one-stage costs. The processes are given by recurrent equations x_{t+1} = F(x_t, a_t, ξ_t), t = 1, 2, ..., with i.i.d. ℝ^k-valued random vectors ξ_t whose density p is unknown. Assuming observability of ξ_t and taking advantage of the procedure of statistical estimation of p used in a previous work by the authors, we construct an average cost optimal adaptive policy.
AB - The paper deals with a class of discrete-time Markov control processes with Borel state and action spaces, and possibly unbounded one-stage costs. The processes are given by recurrent equations x_{t+1} = F(x_t, a_t, ξ_t), t = 1, 2, ..., with i.i.d. ℝ^k-valued random vectors ξ_t whose density p is unknown. Assuming observability of ξ_t and taking advantage of the procedure of statistical estimation of p used in a previous work by the authors, we construct an average cost optimal adaptive policy.
KW - Adaptive policy
KW - Average cost criterion
KW - Markov control process
KW - Projection of estimator
KW - Rate of convergence
UR - http://www.scopus.com/inward/record.url?scp=0008635562&partnerID=8YFLogxK
U2 - 10.1007/PL00003993
DO - 10.1007/PL00003993
M3 - Article
AN - SCOPUS:0008635562
SN - 1432-2994
VL - 48
SP - 37
EP - 55
JO - Mathematical Methods of Operations Research
JF - Mathematical Methods of Operations Research
IS - 1
ER -