Provably Efficient Reinforcement Learning Under Linear Model Structures: From Tabular to Feature Based Exploration

dc.contributor.authorModi, Aditya
dc.date.accessioned2022-01-19T15:23:53Z
dc.date.available2022-01-19T15:23:53Z
dc.date.issued2021
dc.identifier.urihttps://hdl.handle.net/2027.42/171361
dc.description.abstractReinforcement learning (RL) is a machine learning paradigm in which an agent learns to interact with an environment using a reward signal as feedback. Recent breakthroughs have led to renewed interest in building intelligent RL agents by combining core RL algorithms with the expressivity of deep function approximators and advances in simulation-based computation. Despite these advances, in most complex domains RL algorithms need a large amount of interaction data to learn a good policy. As a result, recent theoretical research has focused on quantifying and improving the sample efficiency of RL under various interaction protocols. These efforts are directed towards understanding the statistical aspects of reinforcement learning, which can be a key factor in making progress towards real-world applications ranging from healthcare and robotics to the control of large-scale web systems. The main theme of this thesis is the analysis of such information-theoretic aspects of RL in terms of the structural complexity of the environment, using tools from the learning theory literature. This thesis considers a spectrum of scenarios: from tabular to rich-observation domains, single- to multi-task settings, and reward-specific to reward-free learning. Specifically, it presents theoretical results in the following settings. The first part studies the sample efficiency of online learning in contextual Markov decision processes (MDPs), where the agent interacts with a sequence of tabular environments and has access to a high-dimensional representation that determines the dynamics of each MDP. The second part studies the sample complexity of learning in a structured model class where the true model of an arbitrarily large MDP can be approximated by a feature-based linear combination of a known ensemble of models.
The third part investigates the problem of learning in a low-rank MDP, where I design and analyze the first provably efficient model-free representation learning and exploration algorithm for reward-free learning. Lastly, the fourth part provides results for online multi-task learning in linear quadratic regulators, under the assumption that the transition matrices of the individual systems share a common basis. A common thread running through all results in this thesis is the effect of a linear or low-rank model structure on the sample efficiency of learning. Overall, this thesis proposes novel, provably efficient exploration and model selection algorithms under various linear model structures, and highlights the role of environment complexity in the statistical efficiency of reinforcement learning.
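To make the "feature-based linear combination of a known ensemble of models" assumption from the abstract concrete, here is a minimal illustrative sketch. It is not code from the thesis: the state/action sizes, the randomly drawn base models, and the fixed mixture weights are all hypothetical, chosen only to show that a convex combination of valid transition kernels is itself a valid transition kernel.

```python
import numpy as np

# Hypothetical problem sizes: S states, A actions, K known base models.
S, A, K = 4, 2, 3
rng = np.random.default_rng(0)

# K known base transition models; base_models[k, s, a] is a probability
# distribution over next states, drawn here from a Dirichlet for illustration.
base_models = rng.dirichlet(np.ones(S), size=(K, S, A))  # shape (K, S, A, S)

# Unknown mixture weights on the simplex (the quantity the learner would
# estimate); fixed here for the sketch.
w = np.array([0.5, 0.3, 0.2])

# Modeling assumption: P(s' | s, a) = sum_k w_k * P_k(s' | s, a).
P_true = np.einsum('k,ksap->sap', w, base_models)  # shape (S, A, S)

# Sanity check: each P_true[s, a] is still a distribution over next states.
assert np.allclose(P_true.sum(axis=-1), 1.0)
```

Because the weights lie on the simplex, the mixture inherits non-negativity and normalization from the base kernels, so learning the MDP reduces to estimating the K-dimensional weight vector rather than the full S x A x S table.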
dc.language.isoen_US
dc.subjectReinforcement learning and control
dc.subjectStatistical and online learning theory
dc.subjectLinear systems
dc.subjectEfficient exploration
dc.subjectSequential decision making
dc.titleProvably Efficient Reinforcement Learning Under Linear Model Structures: From Tabular to Feature Based Exploration
dc.typeThesis
dc.description.thesisdegreenamePhDen_US
dc.description.thesisdegreedisciplineComputer Science & Engineering
dc.description.thesisdegreegrantorUniversity of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeememberBaveja, Satinder Singh
dc.contributor.committeememberTewari, Ambuj
dc.contributor.committeememberScott, Clayton D
dc.contributor.committeememberCheraghchi, Mahdi
dc.contributor.committeememberJiang, Nan
dc.subject.hlbsecondlevelComputer Science
dc.subject.hlbsecondlevelMathematics
dc.subject.hlbtoplevelEngineering
dc.subject.hlbtoplevelScience
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/171361/1/admodi_1.pdf
dc.identifier.doihttps://dx.doi.org/10.7302/3873
dc.identifier.orcid0000-0002-6959-0593
dc.identifier.name-orcidModi, Aditya; 0000-0002-6959-0593en_US
dc.working.doi10.7302/3873en
dc.owningcollnameDissertations and Theses (Ph.D. and Master's)

