Efficient Game Solving Through Transfer Learning

dc.contributor.author: Smith, Max
dc.date.accessioned: 2023-09-22T15:42:19Z
dc.date.available: 2023-09-22T15:42:19Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.identifier.uri: https://hdl.handle.net/2027.42/178077
dc.description.abstract: Game-solving approaches using reinforcement learning often entail a significant computational cost. This arises from the necessity of training agents to play with or against a series of other-agent strategies. Each round of training brings us closer to the game's solution, but training an agent can require data from millions of games played, typically in simulation. The cost of game solving reflects the cumulative data cost of repeatedly training agents. This cost is also a result of treating each training as an independent problem. However, these problems share elements that reflect the nature of the game-solving process. These similarities present an opportunity for an agent to transfer learning from previous problems to aid in solving the current problem. In this dissertation, I develop a collection of new game-solving algorithms that are based on new methods for transfer learning, thereby reducing the computational cost of game solving. I explore two types of transferable knowledge: strategic and world. Strategic knowledge describes knowledge that depends on the other agents. In the simplest case, strategic knowledge may be encapsulated in a policy that was trained to play, with or against, fixed other agents. To facilitate the transfer of this kind of strategic knowledge, I propose Q-Mixing, a technique that constructs a policy to play against a distribution of other agents by combining strategic knowledge regarding each agent in the distribution. I provide a practical approximate version of Q-Mixing that features another type of strategic knowledge: a learned belief in the distribution of the other agents. I then develop two game-solving algorithms, Mixed-Oracles and Mixed-Opponents. These algorithms use Q-Mixing to shift the learning focus from interacting with a distribution of other agents to concentrating on a single other agent. This transition results in a significantly easier and, therefore, less costly learning problem.
Complementary to strategic knowledge, world knowledge is independent of the other agents. I demonstrate that co-learning a world model along with game solving allows the world model to benefit from more strategically diverse training data. It also renders game solving more affordable through planning. I realize both of these benefits in a new game-solving algorithm, Dyna-PSRO. Overall, this dissertation introduces new techniques and demonstrates their effectiveness in significantly reducing the cost of game solving. By doing so, it further enables learning-based game-solving algorithms to be applied to more complex games.
dc.language.iso: en_US
dc.subject: Reinforcement Learning
dc.subject: Multiagent Learning
dc.title: Efficient Game Solving Through Transfer Learning
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Computer Science & Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Wellman, Michael P
dc.contributor.committeemember: Schoenebeck, Grant
dc.contributor.committeemember: Baveja, Satinder Singh
dc.contributor.committeemember: Lee, Honglak
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbtoplevel: Engineering
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/178077/1/mxsmith_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/8534
dc.working.doi: 10.7302/8534 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

