A hybrid genetic/optimization algorithm for piecewise affine and convex Markov decision processes.
Lin, Zong-Zhi
1999
Abstract
A variety of finite horizon models of sequential decision making under uncertainty have value functions that are piecewise affine and convex (<italic> PAC</italic>) or equivalently, can be represented by a finite set of affine functions. Such models include: the partially observed Markov decision problem (<italic>POMDP</italic>) and the Markov decision problem (<italic>MDP</italic>) with affine single-step and/or terminal reward function, examples of which are the multi-objective <italic>MDP</italic> and the nonhomogeneous <italic> MDP</italic> (<italic>NMDP</italic>). We develop both optimal and suboptimal solution procedures for this class of models. Our optimal solution approach, <italic> GAMIP</italic>, integrates a genetic algorithm (<italic>GA</italic>) and a mixed integer program (<italic>MIP</italic>). The genetic algorithm step only is a suboptimal design with good solution quality. Both optimal approach and suboptimal design construct the minimal set of affine functions that describe the value function. The <italic>POMDP</italic> and <italic>NMDP</italic> are investigated numerically. The <italic>POMDP</italic> is a generalization of a <italic>MDP</italic> that allows for observations that are probabilistically related to the underlying system state. The value function of the finite horizon <italic>POMDP</italic> is known to be <italic>PAC</italic> in the probability mass vector over the state space. For suboptimal design, we develop two heuristic approaches based on genetic algorithms, termed as dc-niche and spatial, to construct approximations of the minimal set of affine functions that describes the value function. Error bounds of these two approaches are presented. Numerical results indicate that these heuristics can quickly yield high quality solutions for a set of small-scale problems where exact solutions are feasible and can provide suboptimal designs for realistically sized problems. Numerical results indicate that <italic> GAMIP</italic> takes up to 60% less time than the most efficient linear programming-based exact solution method in the literature to construct the minimal set. For the study of an infinite horizon <italic>NMDP</italic>, the objective is to determine an optimal initial action. We transform this infinite horizon <italic> NMDP</italic> into a finite horizon <italic>NMDP</italic> having a terminal reward vector described by set inclusion. Our approach involves finding a forecast horizon, a problem horizon such that an optimal initial decision for the finite horizon problem is optimal for the infinite horizon problem. We make use of the fact that the value function of the finite horizon <italic> NMDP</italic> having a terminal reward vector described by set inclusion is <italic> PAC</italic> in the terminal reward vector. Our solution procedure is based on a combination of the genetic algorithm and mixed integer programming. We show that a suboptimal procedure based solely on the genetic algorithm quickly identifies high quality candidates for the minimal forecast horizon. The mixed integer program is used to confirm or rule out a candidate for a forecast horizon. We then present a numerical evaluation that compares our approach with several other algorithms.Subjects
Algorithm Con Genetic Algorithms Hybrid Markov Decision Processes Mixed Integer Program Optimization Piecewise Affine And Convex
Types
Thesis
Metadata
Show full item recordCollections
Remediation of Harmful Language
The University of Michigan Library aims to describe library materials in a way that respects the people and communities who create, use, and are represented in our collections. Report harmful or offensive language in catalog records, finding aids, or elsewhere in our collections anonymously through our metadata feedback form. More information at Remediation of Harmful Language.
Accessibility
If you are unable to use this file in its current format, please select the Contact Us link and we can modify it to make it more accessible to you.