Learning in dynamic noncooperative multiagent systems.
dc.contributor.author | Hu, Junling | |
dc.contributor.advisor | Wellman, Michael P. | |
dc.date.accessioned | 2016-08-30T17:54:49Z | |
dc.date.available | 2016-08-30T17:54:49Z | |
dc.date.issued | 1999 | |
dc.identifier.uri | http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:9938451 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/131906 | |
dc.description.abstract | Dynamic noncooperative multiagent systems are systems in which self-interested agents interact with each other and their interactions change over time. We investigate the problem of learning and decision making in such systems, modeling them in the framework of general-sum stochastic games with incomplete information. We design a multiagent Q-learning method and prove its convergence in the framework of stochastic games. The standard Q-learning method, a reinforcement learning method, was originally designed for single-agent systems, and its convergence was proved for Markov decision processes, which are single-agent problems. Our extension broadens the framework of reinforcement learning and helps to establish the theoretical foundation for applying it to multiagent systems. We prove that our learning algorithm converges to a Nash equilibrium under certain restrictions on the game structure during learning. In our simulations of a grid-world game, these restrictions are relaxed and our learning method still converges. In addition to model-free reinforcement learning, we also study model-based learning, where agents form models of others and update those models through observations of the environment. We find that agents' mutual learning can lead to a conjectural equilibrium, in which the agents' models of each other are fulfilled and each agent behaves optimally given its expectations. Such an equilibrium state may be suboptimal: the agents may be worse off than had they not attempted to learn models of others at all. This poses a pitfall for multiagent learning. We also analyze the problem of recursive modeling in a dynamic game framework, in contrast with previous work, which studied recursive modeling in static or repeated games. We implement various levels of recursive modeling in a simulated double auction market. Our experiments show that the performance of an agent can be quite sensitive to its assumptions about the policies of other agents, and that when there is substantial uncertainty about the level of sophistication of other agents, reducing the level of recursion may be the best policy. | |
dc.format.extent | 141 p. | |
dc.language | English | |
dc.language.iso | en | |
dc.subject | Decision Making | |
dc.subject | Decision-making | |
dc.subject | Dynamic | |
dc.subject | Multiagent | |
dc.subject | Noncooperative Multiagent Systems | |
dc.subject | Q-learning | |
dc.title | Learning in dynamic noncooperative multiagent systems. | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Applied Sciences | |
dc.description.thesisdegreediscipline | Computer science | |
dc.description.thesisdegreediscipline | Economics | |
dc.description.thesisdegreediscipline | Operations research | |
dc.description.thesisdegreediscipline | Social Sciences | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/131906/2/9938451.pdf | |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |
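The abstract's central idea, extending Q-learning so that each agent backs up the Nash-equilibrium value of the joint Q-tables rather than its own maximum, can be illustrated with a small sketch. This is not the dissertation's algorithm itself: it is a simplified one-state, two-player version in which the stage-game equilibrium is found by enumerating pure strategies (the dissertation works with general stochastic games and mixed-strategy equilibria). All names and the coordination-game payoffs below are illustrative assumptions.

```python
import numpy as np

def pure_nash_value(Q1, Q2):
    """Payoffs of the first pure-strategy Nash equilibrium of the stage
    game defined by Q-tables Q1, Q2 (shape [A1, A2]), found by
    enumerating joint actions. Falls back to each player's maximin
    payoff if no pure equilibrium exists -- a simplification of the
    mixed-strategy equilibria used in the dissertation."""
    n1, n2 = Q1.shape
    for a1 in range(n1):
        for a2 in range(n2):
            # (a1, a2) is a Nash equilibrium if neither player can gain
            # by unilaterally deviating.
            if Q1[a1, a2] >= Q1[:, a2].max() and Q2[a1, a2] >= Q2[a1, :].max():
                return Q1[a1, a2], Q2[a1, a2]
    return Q1.min(axis=1).max(), Q2.min(axis=0).max()

def nash_q(r1, r2, gamma=0.9, alpha=0.5, sweeps=300):
    """Multiagent Q-learning on a one-state, two-player general-sum game.
    The key change from single-agent Q-learning: the backup target uses
    the Nash value of the joint Q-tables, not the agent's own max."""
    Q1 = np.zeros_like(r1, dtype=float)
    Q2 = np.zeros_like(r2, dtype=float)
    for _ in range(sweeps):
        v1, v2 = pure_nash_value(Q1, Q2)   # equilibrium continuation values
        Q1 += alpha * (r1 + gamma * v1 - Q1)
        Q2 += alpha * (r2 + gamma * v2 - Q2)
    return Q1, Q2

# Illustrative coordination game: both agents prefer joint action (0, 0).
r = np.array([[2.0, 0.0], [0.0, 1.0]])
Q1, Q2 = nash_q(r, r)
# At the fixed point, Q1[0, 0] approaches 2 / (1 - gamma) = 20.
```

In this symmetric example both Q-tables converge to the same fixed point, where the joint action (0, 0) remains a stage-game equilibrium throughout learning; the restrictions on game structure mentioned in the abstract play an analogous role of keeping the stage-game equilibria well behaved during learning.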