Asymptotic optimality of finite model approximations for partially observed Markov decision processes with discounted cost

Saldı, Naci; Yuksel, S.; Linder, T.

dc.contributor.author	Saldı, Naci
dc.contributor.author	Yuksel, S.
dc.contributor.author	Linder, T.
dc.date.accessioned	2021-02-10T12:28:01Z
dc.date.available	2021-02-10T12:28:01Z
dc.date.issued	2020-01
dc.identifier.issn	0018-9286	en_US
dc.identifier.uri	http://hdl.handle.net/10679/7292
dc.identifier.uri	https://ieeexplore.ieee.org/document/8673573
dc.description.abstract	We consider finite model approximations of discrete-time partially observed Markov decision processes (POMDPs) under the discounted cost criterion. After converting the original partially observed stochastic control problem to a fully observed one on the belief space, the finite models are obtained through the uniform quantization of the state and action spaces of the belief space Markov decision process (MDP). Under mild assumptions on the components of the original model, it is established that the policies obtained from these finite models are nearly optimal for the belief space MDP, and hence for the original partially observed problem. The assumptions essentially require that the belief space MDP satisfies a mild weak continuity condition. We provide an example and introduce explicit approximation procedures for the quantization of the set of probability measures on the state space of the POMDP (i.e., the belief space).	en_US
dc.description.sponsorship	Natural Sciences and Engineering Research Council of Canada (NSERC)
dc.language.iso	eng	en_US
dc.publisher	IEEE	en_US
dc.relation.ispartof	IEEE Transactions on Automatic Control
dc.rights	restrictedAccess
dc.title	Asymptotic optimality of finite model approximations for partially observed Markov decision processes with discounted cost	en_US
dc.type	Article	en_US
dc.peerreviewed	yes	en_US
dc.publicationstatus	Published	en_US
dc.contributor.department	Özyeğin University
dc.contributor.authorID	(ORCID 0000-0002-2677-7366 & YÖK ID 283091) Saldı, Naci
dc.contributor.ozuauthor	Saldı, Naci
dc.identifier.volume	65	en_US
dc.identifier.issue	1	en_US
dc.identifier.startpage	130	en_US
dc.identifier.endpage	142	en_US
dc.identifier.wos	WOS:000506851100010
dc.identifier.doi	10.1109/TAC.2019.2907172	en_US
dc.subject.keywords	Aerospace electronics	en_US
dc.subject.keywords	Convergence	en_US
dc.subject.keywords	Quantization (signal)	en_US
dc.subject.keywords	Markov processes	en_US
dc.subject.keywords	Computational modeling	en_US
dc.subject.keywords	Cost function	en_US
dc.subject.keywords	Approximations	en_US
dc.subject.keywords	Markov decision processes	en_US
dc.subject.keywords	Non-linear filtering	en_US
dc.subject.keywords	Quantization	en_US
dc.subject.keywords	Stochastic control	en_US
dc.identifier.scopus	SCOPUS:2-s2.0-85077786832
dc.contributor.authorMale	1
dc.relation.publicationcategory	Article - International Refereed Journal - Institutional Academic Staff
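
The abstract refers to quantizing the belief space, that is, the set of probability measures on the POMDP state space. As a rough illustration only, the sketch below assumes a finite state space and maps a belief vector to a nearby vector whose entries are multiples of 1/n. The function name quantize_belief, the parameter n, and the largest-remainder rounding rule are assumptions made for this example; the record does not say which procedure the article uses.

```python
from fractions import Fraction
from math import floor


def quantize_belief(belief, n):
    """Map a probability vector to a nearby one with entries in {0, 1/n, ..., 1}.

    Hypothetical helper for illustration; not taken from the article.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    weights = [Fraction(b) for b in belief]  # exact arithmetic, no float drift
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("belief must be nonnegative with positive total mass")

    # Rescale so the entries sum to exactly n, then round each one down.
    scaled = [n * w / total for w in weights]
    counts = [floor(s) for s in scaled]

    # Hand the leftover units to the entries with the largest fractional parts.
    deficit = n - sum(counts)
    order = sorted(
        range(len(scaled)),
        key=lambda i: scaled[i] - counts[i],
        reverse=True,
    )
    for i in order[:deficit]:
        counts[i] += 1

    # Each entry moves by less than 1/n, and the result still sums to 1.
    return [Fraction(c, n) for c in counts]


if __name__ == "__main__":
    print(quantize_belief([0.5, 0.3, 0.2], 4))
```

Under this scheme, increasing n refines the grid, so the quantized belief can be made as close to the original as desired. The grid size grows quickly with the number of states, which is one reason a real implementation might choose a different procedure.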