Efficient Game Solving Through Transfer Learning

Smith, Max

dc.contributor.author	Smith, Max
dc.date.accessioned	2023-09-22T15:42:19Z
dc.date.available	2023-09-22T15:42:19Z
dc.date.issued	2023
dc.date.submitted	2023
dc.identifier.uri	https://hdl.handle.net/2027.42/178077
dc.description.abstract	Game-solving approaches using reinforcement learning often entail a significant computational cost. This arises from the necessity of training agents to play with or against a series of other-agent strategies. Each round of training brings us closer to the game's solution, but training an agent can require data from millions of games played, typically in simulation. The cost of game solving reflects the cumulative data cost of repeatedly training agents. This cost is also a result of treating each training as an independent problem. However, these problems share elements that reflect the nature of the game-solving process. These similarities present an opportunity for an agent to transfer learning from previous problems to aid in solving the current problem. In this dissertation, I develop a collection of new game-solving algorithms that are based on new methods for transfer learning, thereby reducing the computational cost of game solving. I explore two types of transferable knowledge: strategic and world. Strategic knowledge describes knowledge that depends on the other agents. In the simplest case, strategic knowledge may be encapsulated in a policy that was trained to play, with or against, fixed other agents. To facilitate the transfer of this kind of strategic knowledge, I propose Q-Mixing, a technique that constructs a policy to play against a distribution of other agents by combining strategic knowledge regarding each agent in the distribution. I provide a practical approximate version of Q-Mixing that features another type of strategic knowledge: a learned belief in the distribution of the other agents. I then develop two game-solving algorithms, Mixed-Oracles and Mixed-Opponents. These algorithms use Q-Mixing to shift the learning focus from interacting with a distribution of other agents to concentrating on a single other agent. This transition results in a significantly easier and, therefore, less costly learning problem.
Complementary to strategic knowledge, world knowledge is independent of the other agents. I demonstrate that co-learning a world model along with game solving allows the world model to benefit from more strategically diverse training data. It also renders game solving more affordable through planning. I realize both of these benefits in a new game-solving algorithm, Dyna-PSRO. Overall, this dissertation introduces new techniques and demonstrates their effectiveness in significantly reducing the cost of game solving. By doing so, it further enables learning-based game-solving algorithms to be applied to more complex games.
dc.language.iso	en_US
dc.subject	Reinforcement Learning
dc.subject	Multiagent Learning
dc.title	Efficient Game Solving Through Transfer Learning
dc.type	Thesis
dc.description.thesisdegreename	PhD	en_US
dc.description.thesisdegreediscipline	Computer Science & Engineering
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember	Wellman, Michael P
dc.contributor.committeemember	Schoenebeck, Grant
dc.contributor.committeemember	Baveja, Satinder Singh
dc.contributor.committeemember	Lee, Honglak
dc.subject.hlbsecondlevel	Computer Science
dc.subject.hlbtoplevel	Engineering
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/178077/1/mxsmith_1.pdf
dc.identifier.doi	https://dx.doi.org/10.7302/8534
dc.working.doi	10.7302/8534	en
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: mxsmith_1.pdf
Size: 4.666MB
Format: PDF
