Simplex Algorithm for Countable-state Discounted Markov Decision Processes

Lee, Ilbin; Epelman, Marina A.; Romeijn, H. Edwin; Smith, Robert L.

dc.contributor.author	Lee, Ilbin
dc.contributor.author	Epelman, Marina A.
dc.contributor.author	Romeijn, H. Edwin
dc.contributor.author	Smith, Robert L.
dc.date.accessioned	2014-11-18T16:06:37Z
dc.date.available	2014-11-18T16:06:37Z
dc.date.issued	2014-11-18
dc.identifier.uri	https://hdl.handle.net/2027.42/109413
dc.description	Submitted to Operations Research; preliminary version.	en_US
dc.description.abstract	We consider discounted Markov Decision Processes (MDPs) with countably-infinite state spaces, finite action spaces, and unbounded rewards. Typical examples of such MDPs are inventory management and queueing control problems in which there is no specific limit on the size of inventory or queue. Existing solution methods obtain a sequence of policies that converges to optimality in value but may not improve monotonically, i.e., a policy in the sequence may be worse than preceding policies. Our proposed approach considers countably-infinite linear programming (CILP) formulations of the MDPs (a CILP is defined as a linear program (LP) with countably-infinite numbers of variables and constraints). Under standard assumptions for analyzing MDPs with countably-infinite state spaces and unbounded rewards, we extend the major theoretical extreme point and duality results to the resulting CILPs. Under an additional technical assumption which is satisfied by several applications of interest, we present a simplex-type algorithm that is implementable in the sense that each of its iterations requires only a finite amount of data and computation. We show that the algorithm finds a sequence of policies which improves monotonically and converges to optimality in value. Unlike existing simplex-type algorithms for CILPs, our proposed algorithm solves a class of CILPs in which each constraint may contain an infinite number of variables and each variable may appear in an infinite number of constraints. A numerical illustration for inventory management problems is also presented.	en_US
dc.description.sponsorship	National Science Foundation grant CMMI-1333260	en_US
dc.description.sponsorship	A research grant from the University of Michigan	en_US
dc.language.iso	en_US	en_US
dc.subject	Simplex Algorithm	en_US
dc.subject	Infinite Linear Programs	en_US
dc.subject	Dynamic Programming	en_US
dc.title	Simplex Algorithm for Countable-state Discounted Markov Decision Processes	en_US
dc.type	Article	en_US
dc.subject.hlbsecondlevel	Industrial and Operations Engineering
dc.subject.hlbtoplevel	Engineering
dc.contributor.affiliationum	Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA	en_US
dc.contributor.affiliationumcampus	Ann Arbor	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/109413/1/CountableStateMDP-MAE.pdf
dc.description.filedescription	Description of CountableStateMDP-MAE.pdf : Main article (preliminary version)
dc.owningcollname	Industrial and Operations Engineering, Department of (IOE)

Files in this item

Name: CountableStateMDP-MAE.pdf
Size: 436.1 KB
Format: PDF
Description: Main article (preliminary version)
