Non-Asymptotic Adaptive Control of Linear-Quadratic Systems
dc.contributor.author | Shirani Faradonbeh, Mohamad Kazem | |
dc.date.accessioned | 2018-01-31T18:19:35Z | |
dc.date.available | NO_RESTRICTION | |
dc.date.available | 2018-01-31T18:19:35Z | |
dc.date.issued | 2017 | |
dc.date.submitted | 2017 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/140882 | |
dc.description.abstract | Optimal control for the canonical model of systems with linear dynamics and quadratic operating costs (known as LQ systems) is a well-studied problem in the stochastic control literature. When the true system dynamics are unknown, an adaptive policy is required for learning the model parameters and planning a control policy simultaneously. Addressing this trade-off between accurate estimation and good control is the main challenge in the area of adaptive control. Another important issue is to prevent the system from becoming destabilized (in the sense that its state grows in an uncontrolled fashion) due to lack of knowledge of the system dynamics. Asymptotically optimal approaches have been thoroughly investigated in the literature, but non-asymptotic results are few and rather incomplete. To derive such results, new concepts and technical tools need to be developed for estimation during the stabilization period of the system. In adaptive control, system performance is measured by the regret, which is the difference between the cost of the adaptive policy and that of the optimal control designed with knowledge of the true dynamics. In this work, we establish non-asymptotic high-probability regret bounds, which are optimal up to a logarithmic factor, for different LQ systems with and without identifiability assumptions. We also provide high-probability guarantees for a stabilization algorithm based on random linear feedbacks. The results obtained are fairly general, since the assumptions needed are (i) stabilizability of the matrices encoding the system's dynamics, and (ii) a condition on the heaviness of the distribution of the noise vectors. The study also provides novel results regarding the estimation of the parameters of presumably unstable Vector Autoregressive (VAR) models. 
In the classical literature, there are hardly any results for the unstable case, especially regarding finite sample bounds, which are the subject of this work. Our results express the required sample size as a function of the problem dimension and key characteristics of the true underlying transition matrix and the innovation distribution. To obtain them, appropriate concentration inequalities for random matrices and for sequences of martingale differences are leveraged. | |
dc.language.iso | en_US | |
dc.subject | Non-Asymptotic Adaptive Control | |
dc.subject | Linear Systems | |
dc.subject | Finite Time Stabilization | |
dc.subject | Reinforcement Learning | |
dc.subject | Unstable Vector Autoregressive | |
dc.subject | Finite Sample Estimation | |
dc.title | Non-Asymptotic Adaptive Control of Linear-Quadratic Systems | |
dc.type | Thesis | en_US |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Statistics | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Michailidis, George | |
dc.contributor.committeemember | Tewari, Ambuj | |
dc.contributor.committeemember | Teneketzis, Demosthenis | |
dc.contributor.committeemember | Keener, Robert W | |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbsecondlevel | Electrical Engineering | |
dc.subject.hlbsecondlevel | Engineering (General) | |
dc.subject.hlbsecondlevel | Industrial and Operations Engineering | |
dc.subject.hlbsecondlevel | Mathematics | |
dc.subject.hlbsecondlevel | Statistics and Numeric Data | |
dc.subject.hlbtoplevel | Engineering | |
dc.subject.hlbtoplevel | Science | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/140882/1/shirany_1.pdf | |
dc.identifier.orcid | 0000-0002-3807-5919 | |
dc.identifier.name-orcid | Shirani Faradonbeh, Mohamad Kazem; 0000-0002-3807-5919 | en_US |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |