On the Importance of Inherent Structural Properties for Learning in Markov Decision Processes
dc.contributor.author | Adler, Saghar | |
dc.date.accessioned | 2024-05-22T17:24:56Z | |
dc.date.available | 2024-05-22T17:24:56Z | |
dc.date.issued | 2024 | |
dc.date.submitted | 2024 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/193341 | |
dc.description.abstract | Recently, reinforcement learning methodologies have been applied to solve sequential decision-making problems in various fields, such as robotics and autonomous control, communication and networking, and resource allocation and scheduling. Despite great practical success, there has been less progress in developing theoretical performance guarantees for such complex systems. This dissertation aims to address the limitations of current theoretical frameworks and to extend the applicability of learning-based control methods to the more complex, real-life domains above. This objective is achieved in two different settings by exploiting inherent structural properties of the Markov decision processes used to model such systems. In the first setting, for admission control in systems modeled by the Erlang-B blocking model with unknown arrival and service rates, we use model knowledge to compensate for the lack of reward signals. Here, we propose a learning algorithm based on self-tuning adaptive control, and we not only prove that our algorithm is asymptotically optimal but also provide finite-time regret guarantees. The second setting develops a framework to address the challenge of applying reinforcement learning methods to Markov decision processes with countably infinite state spaces and unbounded cost functions. An existing learning algorithm based on Thompson sampling with dynamically-sized episodes is extended to countably infinite state spaces using the ergodicity properties of Markov decision processes. We establish asymptotic optimality of our learning-based control policy by providing a sub-linear (in the time horizon) regret guarantee. Our framework focuses on models that arise in queueing models of communication networks, computing systems, and processing networks. Hence, to demonstrate the applicability of our method, we also apply it to the problem of controlling two queueing systems with unknown dynamics. | |
dc.language.iso | en_US | |
dc.subject | Reinforcement Learning | |
dc.subject | Learning in queueing systems | |
dc.title | On the Importance of Inherent Structural Properties for Learning in Markov Decision Processes | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | |
dc.description.thesisdegreediscipline | Electrical and Computer Engineering | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Subramanian, Vijay Gautam | |
dc.contributor.committeemember | Cohen, Asaf | |
dc.contributor.committeemember | Liu, Mingyan | |
dc.contributor.committeemember | Ying, Lei | |
dc.subject.hlbsecondlevel | Electrical Engineering | |
dc.subject.hlbtoplevel | Engineering | |
dc.contributor.affiliationumcampus | Ann Arbor | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/193341/1/ssaghar_1.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/22986 | |
dc.identifier.orcid | 0009-0004-8214-0222 | |
dc.identifier.name-orcid | Adler, Saghar; 0009-0004-8214-0222 | en_US |
dc.working.doi | 10.7302/22986 | en |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |
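The self-tuning (certainty-equivalence) idea in the abstract's first setting can be sketched as follows. This is an illustrative toy, not the dissertation's algorithm: the reward-minus-holding-cost objective, the `erlang_b`/`ce_threshold` helper names, and the crude initial estimate of the service rate are all assumptions made here for the sketch. The only parts taken from standard theory are the Erlang-B blocking recursion and the maximum-likelihood rate estimates (arrivals per unit time, completions per unit of server busy time).

```python
import math
import random


def erlang_b(n, offered_load):
    """Erlang-B blocking probability of an M/M/n/n system, via the
    standard recursion B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1))."""
    b = 1.0
    for k in range(1, n + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def ce_threshold(lam_hat, mu_hat, max_servers, reward, hold_cost):
    """Certainty-equivalent admission threshold: treat the estimates as the
    true rates and pick the threshold n maximizing estimated reward
    throughput minus holding cost (an illustrative objective)."""
    a = lam_hat / mu_hat  # estimated offered load
    best_n, best_val = 1, -math.inf
    for n in range(1, max_servers + 1):
        carried = lam_hat * (1.0 - erlang_b(n, a))  # admitted traffic rate
        occupancy = carried / mu_hat                # mean busy servers (Little's law)
        val = reward * carried - hold_cost * occupancy
        if val > best_val:
            best_n, best_val = n, val
    return best_n


def simulate_self_tuning(horizon, lam, mu, servers, reward, hold_cost, seed=0):
    """Self-tuning loop: re-estimate (lam, mu) from observations, then admit
    an arrival iff occupancy is below the certainty-equivalent threshold."""
    rng = random.Random(seed)
    t, busy = 0.0, 0
    arrivals, completions, busy_time = 0, 0, 0.0
    thr = 1
    while t < horizon:
        # Memoryless race between the next arrival and the next completion.
        total_rate = lam + busy * mu
        dt = rng.expovariate(total_rate)
        busy_time += busy * dt
        t += dt
        if rng.random() < lam / total_rate:  # arrival
            arrivals += 1
            lam_hat = arrivals / t
            # Crude initialization before any completion is observed.
            mu_hat = completions / busy_time if completions > 0 else lam_hat
            thr = ce_threshold(lam_hat, mu_hat, servers, reward, hold_cost)
            if busy < min(thr, servers):
                busy += 1                    # admit
        else:                                # service completion
            busy -= 1
            completions += 1
    return arrivals, completions, thr
```

With zero holding cost the certainty-equivalent threshold is always the full server pool, since blocking decreases monotonically in the number of servers; the interesting regime, and the one the dissertation's regret analysis concerns, is when estimation error makes the chosen threshold temporarily suboptimal.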