Computationally Efficient Relational Reinforcement Learning

dc.contributor.author    Bloch, Mitchell
dc.date.accessioned    2018-10-25T17:38:29Z
dc.date.available    NO_RESTRICTION
dc.date.available    2018-10-25T17:38:29Z
dc.date.issued    2018
dc.date.submitted    2018
dc.identifier.uri    https://hdl.handle.net/2027.42/145859
dc.description.abstract    Relational Reinforcement Learning (RRL) is a technique that enables Reinforcement Learning (RL) agents to generalize from their experience, allowing them to learn over large or potentially infinite state spaces, to learn context-sensitive behaviors, to solve variable goals, and to transfer knowledge between similar situations. Prior RRL architectures are not sufficiently computationally efficient to see use outside of small, niche roles within larger Artificial Intelligence (AI) architectures. I present a novel online, incremental RRL architecture and an implementation that is orders of magnitude faster than its predecessors. The first aspect of this architecture that I explore is a computationally efficient implementation of adaptive Hierarchical Tile Coding (aHTC), a kind of Adaptive Tile Coding (ATC) in which more general tiles covering larger portions of the state-action space are kept even as finer tiles covering smaller portions are introduced, using k-dimensional tries (k-d tries) to implement the value function for non-relational Temporal Difference (TD) methods. In order to achieve comparable performance for RRL, I implement the Rete algorithm to replace my k-d tries due to its efficient handling of both the variable binding problem and variable numbers of actions. Tying aHTCs and Rete together, I present a rule grammar that both maps aHTCs onto Rete and allows the architecture to automatically extract relational features in order to support adaptation of the value function over time. I experiment with several refinement criteria and with additional functionality with which my agents attempt to determine whether re-refinement using different features might allow them to better learn a near-optimal policy. I present optimal results using a value criterion for several variants of Blocks World. I provide transfer results for Blocks World and a scalable Taxicab domain. I additionally introduce a Higher Order Grammar (HOG) that grants online, incremental RRL agents the flexibility to introduce additional variables and corresponding relations as needed in order to learn effective value functions. I evaluate agents that use the HOG on a version of Blocks World and on an Adventure task. In summary, I present a new online, incremental RRL architecture, a grammar to map aHTCs onto Rete, and an implementation that is orders of magnitude faster than its predecessors.
dc.language.iso    en_US
dc.subject    Relational Reinforcement Learning
dc.subject    Rete
dc.subject    Adaptive Tile Coding
dc.subject    Online Learning
dc.subject    Sequential Decision Making
dc.title    Computationally Efficient Relational Reinforcement Learning
dc.type    Thesis    en_US
dc.description.thesisdegreename    PhD    en_US
dc.description.thesisdegreediscipline    Computer Science & Engineering
dc.description.thesisdegreegrantor    University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember    Laird, John E
dc.contributor.committeemember    Lewis, Richard L
dc.contributor.committeemember    Baveja, Satinder Singh
dc.contributor.committeemember    Durfee, Edmund H
dc.subject.hlbsecondlevel    Computer Science
dc.subject.hlbtoplevel    Engineering
dc.description.bitstreamurl    https://deepblue.lib.umich.edu/bitstream/2027.42/145859/1/bazald_1.pdf
dc.identifier.orcid    0000-0002-7219-4786
dc.identifier.name-orcid    Bloch, Mitchell; 0000-0002-7219-4786    en_US
dc.owningcollname    Dissertations and Theses (Ph.D. and Master's)
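
For readers of the abstract above, the following is a minimal illustrative sketch, not the dissertation's implementation, of the general idea behind a hierarchical tile-coded value function for one-step Temporal Difference learning: coarser tiles that generalize broadly are kept even as finer tiles are introduced, and the value of a state-action pair is the sum of the weights of every tile that covers it. The class name, the resolution levels, the uniform state discretization, and the learning parameters are all assumptions made for illustration.

# A minimal sketch of a hierarchical tile-coded action-value function for
# one-step TD learning, loosely in the spirit of the aHTC described above.
# NOT the dissertation's implementation; resolutions, the uniform state
# discretization, and the learning parameters are illustrative assumptions.
from collections import defaultdict

class HierarchicalTileQ:
    def __init__(self, resolutions=(1.0, 0.25), alpha=0.1, gamma=0.9):
        self.resolutions = resolutions         # coarse tiles are kept even as finer tiles are added
        self.alpha = alpha / len(resolutions)  # split the step size across hierarchy levels
        self.gamma = gamma
        self.weights = defaultdict(float)      # sparse tile -> weight table (trie-like in effect)

    def _tiles(self, state, action):
        # One tile per resolution level; each tile generalizes over a hypercube of states.
        return [(res, tuple(int(x // res) for x in state), action)
                for res in self.resolutions]

    def q(self, state, action):
        # The value estimate is the sum of the weights of every tile covering (state, action).
        return sum(self.weights[t] for t in self._tiles(state, action))

    def update(self, state, action, reward, next_state, next_actions):
        # One-step TD update (Q-learning style) applied to every covering tile.
        target = reward + self.gamma * max(self.q(next_state, a) for a in next_actions)
        error = target - self.q(state, action)
        for t in self._tiles(state, action):
            self.weights[t] += self.alpha * error

As a hypothetical usage, agent = HierarchicalTileQ() followed by agent.update((0.3, 0.7), 'move', 1.0, (0.4, 0.7), ['move', 'stay']) adjusts both the coarse and the fine tile covering that state-action pair, so agent.q((0.3, 0.7), 'move') reflects the update at both levels of generality.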