Abstract: We consider a discrete-time Markov decision process with Borel state and action spaces. The performance criterion is to maximize a total expected utility determined by an unbounded return function. We show the existence of optimal strategies under general conditions that allow the reward function to be unbounded both from above and below, and the action sets available to the decision maker at each step to be non-compact. To deal with unbounded reward functions, a new characterization of the weak convergence of probability measures is derived. Our results are illustrated by examples.

Keywords: Markov decision processes · Expected total reward · Unbounded return · Weak convergence of measures

Mathematics Subject Classification: 90C40 · 60J05

1 Introduction

In this paper, our objective is to provide sufficient conditions for the existence of optimal strategies in dynamic programming decision models under the expected total reward criterion. The model under consideration is rather general, since the reward function may be unbounded both from above and below, and the action sets available to the decision maker at each step need not be compact.

A. Genadot alexandre.genadot@math.u-bordeaux.fr F. Dufour francois.dufour@math.u-bordeaux.fr Institut Polytechnique de Bordeaux, INRIA Bordeaux Sud Ouest, Team: CQFD, IMB, Institut de Mathématiques
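For intuition only, the expected-reward optimization the paper studies can be illustrated in the simplest possible setting: a finite-state, finite-action MDP solved by value iteration. This is a minimal sketch, not the paper's construction (the paper works with Borel spaces, unbounded rewards, and non-compact action sets); the transition probabilities and rewards below are hypothetical, and a discount factor is used so the fixed-point iteration converges.

```python
import numpy as np

# Hypothetical toy MDP: 2 states, 2 actions (all numbers illustrative).
# P[a, s, s'] = probability of moving from s to s' under action a.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transitions under action 0
    [[0.5, 0.5], [0.7, 0.3]],   # transitions under action 1
])
# r[s, a] = one-step reward for taking action a in state s.
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.95                     # discount factor (ensures a contraction)

# Value iteration: V_{n+1}(s) = max_a [ r(s,a) + gamma * sum_s' P(s'|s,a) V_n(s') ]
V = np.zeros(2)
for _ in range(10000):
    Q = np.array([[r[s, a] + gamma * P[a, s] @ V for a in range(2)]
                  for s in range(2)])
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=1)        # greedy (stationary) policy w.r.t. the limit values
```

In this compact, discounted setting an optimal stationary policy always exists; the point of the paper is precisely that existence can fail, and must be re-established under weaker hypotheses, once rewards are unbounded and action sets are not compact.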
Applied Mathematics and Optimization – Springer Journals
Published: Oct 23, 2018