Leveraging Compositional Structure for Reinforcement Learning and Sequential Decision Making Problems
dc.contributor.author | Liu, Anthony | |
dc.date.accessioned | 2025-05-12T17:36:03Z | |
dc.date.available | 2025-05-12T17:36:03Z | |
dc.date.issued | 2025 | |
dc.date.submitted | 2025 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/197133 | |
dc.description.abstract | Deep learning approaches have made tremendous progress toward solving reinforcement learning and sequential decision-making problems. However, current approaches still struggle with long-horizon tasks that require strong generalization: tasks that take many actions to solve and that require the agent to generalize its behavior from prior experience to new situations. A dominant approach is to solve these tasks hierarchically: a high-level agent decomposes a task into multiple “subtasks”, which a specialized low-level agent solves individually. The effectiveness of this approach is enhanced by identifying and exploiting the inherent compositional structure of tasks, which enables more efficient learning and broader generalization. This dissertation introduces novel methodologies that build on task compositionality to address these challenges. Key contributions include: (1) Higher-Order Skill Learning: a hierarchical reinforcement learning framework in which low-level policies optimize for sequences of subtasks rather than individual ones, improving efficiency and performance. (2) Parameterized Task Structures: parameterized subtask graphs that model tasks with compositional structure, improving both the efficiency of task inference and generalization to unseen entities. Additional contributions show that compositional structure can be exploited through language and large language models (LLMs): (3) Integrating Multimodal Observations in Language Models: demonstrating that visual observations can be embedded as input tokens for LLMs, achieving state-of-the-art performance on visually grounded planning tasks. (4) Skill Abstractions for LLM Planning: highlighting the benefits of providing LLMs with structured descriptions of subtasks, or skills, to improve planning and reasoning. (5) Code-Augmented Planning: a method in which LLMs use control-flow constructs to generate and execute code for solving complex planning tasks, significantly improving task performance. Collectively, these approaches show how leveraging task compositionality, through hierarchical structures and through language and LLMs, can improve reinforcement learning and sequential decision-making frameworks. | |
dc.language.iso | en_US | |
dc.subject | hierarchical reinforcement learning | |
dc.subject | planning | |
dc.subject | task decomposition | |
dc.subject | large language models | |
dc.subject | task generalization | |
dc.title | Leveraging Compositional Structure for Reinforcement Learning and Sequential Decision Making Problems | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | |
dc.description.thesisdegreediscipline | Computer Science & Engineering | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Lee, Honglak | |
dc.contributor.committeemember | Ying, Lei | |
dc.contributor.committeemember | Baveja, Satinder Singh | |
dc.contributor.committeemember | Chai, Joyce | |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbtoplevel | Engineering | |
dc.contributor.affiliationumcampus | Ann Arbor | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/197133/1/anthliu_1.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/25559 | |
dc.identifier.orcid | 0009-0005-3871-4206 | |
dc.working.doi | 10.7302/25559 | en |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |
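To make the hierarchical decomposition described in the abstract concrete, the following is a minimal, illustrative Python sketch of a high-level policy that selects a sequence of subtasks from a toy subtask graph and a low-level policy that conditions on that whole sequence rather than only the current subtask. It is not code from the dissertation; all class and function names (Subtask, ParameterizedSubtaskGraph, high_level_policy, low_level_policy, run_episode) are hypothetical and assumed for illustration only.

```python
"""Illustrative sketch of hierarchical task decomposition with a toy subtask graph.
All names are hypothetical; this is not the dissertation's implementation."""
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Subtask:
    name: str
    # Predicate over the environment state that says when the subtask is complete.
    is_done: Callable[[dict], bool]


@dataclass
class ParameterizedSubtaskGraph:
    """Toy stand-in for a subtask graph: subtasks plus precondition edges."""
    subtasks: List[Subtask] = field(default_factory=list)
    # Maps a subtask name to the names of subtasks that must be completed first.
    preconditions: Dict[str, List[str]] = field(default_factory=dict)

    def ready(self, completed: Set[str]) -> List[Subtask]:
        """Return subtasks whose preconditions are satisfied and are not yet done."""
        return [
            t for t in self.subtasks
            if t.name not in completed
            and set(self.preconditions.get(t.name, [])) <= completed
        ]


def high_level_policy(graph: ParameterizedSubtaskGraph, completed: Set[str]) -> List[Subtask]:
    """Propose an ordered sequence of currently feasible subtasks."""
    return graph.ready(completed)


def low_level_policy(state: dict, subtask_sequence: List[Subtask]) -> str:
    """Choose a primitive action conditioned on the whole upcoming subtask
    sequence, not just the first subtask (the 'higher-order skill' idea)."""
    upcoming = ",".join(t.name for t in subtask_sequence)
    return f"act-towards({upcoming})"


def run_episode(graph: ParameterizedSubtaskGraph,
                env_step: Callable[[dict, str], dict]) -> Set[str]:
    state, completed = {"steps": 0}, set()
    while len(completed) < len(graph.subtasks) and state["steps"] < 100:
        plan = high_level_policy(graph, completed)
        if not plan:
            break
        action = low_level_policy(state, plan)
        state = env_step(state, action)
        # Mark any subtask whose completion predicate now holds.
        completed |= {t.name for t in plan if t.is_done(state)}
    return completed


if __name__ == "__main__":
    graph = ParameterizedSubtaskGraph(
        subtasks=[
            Subtask("get-key", lambda s: s["steps"] >= 3),
            Subtask("open-door", lambda s: s["steps"] >= 6),
        ],
        preconditions={"open-door": ["get-key"]},
    )
    # A trivial environment that only counts steps, used to exercise the loop.
    done = run_episode(graph, lambda s, a: {"steps": s["steps"] + 1})
    print(done)  # {'get-key', 'open-door'}
```

The precondition edges stand in for the compositional structure the dissertation exploits; in practice the graph parameters and completion predicates would be inferred or learned rather than hand-written as above.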