Learning True Objectives: Linear Algebraic Characterizations of Identifiability in Inverse Reinforcement Learning
dc.contributor.author | Shehab, Mohamad Louai | |
dc.contributor.author | Aspeel, Antoine | |
dc.contributor.author | Arechiga, Nikos | |
dc.contributor.author | Best, Andrew | |
dc.contributor.author | Ozay, Necmiye | |
dc.date.accessioned | 2024-05-31T15:00:17Z | |
dc.date.available | 2024-05-31T15:00:17Z | |
dc.date.issued | 2024-05-31 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/193507 | en |
dc.description.abstract | Inverse reinforcement learning (IRL) has emerged as a powerful paradigm for extracting expert skills from observed behavior, with applications ranging from autonomous systems to human-robot interaction. However, the identifiability issue within IRL poses a significant challenge, as multiple reward functions can explain the same observed behavior. This paper provides a linear algebraic characterization of several identifiability notions for an entropy-regularized finite-horizon Markov decision process (MDP). Moreover, our approach allows for the seamless integration of prior knowledge, in the form of featurized reward functions, to enhance the identifiability of IRL problems. The results are demonstrated with experiments on a grid world environment. | en_US |
dc.description.sponsorship | Toyota Research Institute (“TRI”), NSF grants CNS-1931982 and CNS-1918123. | en_US |
dc.language.iso | en_US | en_US |
dc.rights | Attribution 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.title | Learning True Objectives: Linear Algebraic Characterizations of Identifiability in Inverse Reinforcement Learning | en_US |
dc.type | Conference Paper | en_US |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbsecondlevel | Electrical Engineering | |
dc.subject.hlbtoplevel | Engineering | |
dc.description.peerreviewed | Peer Reviewed | en_US |
dc.contributor.affiliationumcampus | Ann Arbor | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/193507/1/Shehab163.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/23151 | |
dc.identifier.source | Learning for Decision and Control Conference (L4DC) 2024 | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-5552-4392 | en_US |
dc.identifier.orcid | https://orcid.org/0000-0003-3011-7122 | en_US |
dc.description.filedescription | Description of Shehab163.pdf : Main article | |
dc.description.depositor | SELF | en_US |
dc.identifier.name-orcid | Ozay, Necmiye; 0000-0002-5552-4392 | en_US |
dc.identifier.name-orcid | Aspeel, Antoine; 0000-0003-3011-7122 | en_US |
dc.owningcollname | Electrical Engineering and Computer Science, Department of (EECS) |