New Directions in Online Learning: Boosting, Partial Information, and Non-Stationarity

dc.contributor.author: Jung, Young Hun
dc.date.accessioned: 2020-05-08T14:33:56Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2020-05-08T14:33:56Z
dc.date.issued: 2020
dc.date.submitted: 2020
dc.identifier.uri: https://hdl.handle.net/2027.42/155110
dc.description.abstract: Online learning, where a learning algorithm fits a model on the fly from streaming data, has become an important research area in machine learning. Batch learning, where the entire data set must be available to the learning algorithm, is not always a suitable paradigm in the big data era. In many practical situations, such as online ad prediction or the control of self-driving cars, data instances naturally arrive in a sequential manner, and researchers want to update their models in an online fashion. This dissertation pursues several topics at the frontier of online learning research.

Chapters 2 and 3 start the journey with online boosting, which studies how to combine multiple online weak learners into a single stronger learner. Chapter 2 considers online multi-class classification problems. Chapter 3 focuses on the more challenging multi-label ranking problem, where there are multiple correct labels and the learner outputs a ranking of labels by their relevance. Both chapters propose an optimal algorithm and an adaptive algorithm. The optimal algorithms require a minimal number of weak learners to attain the desired accuracy. The adaptive algorithms are more useful in practice since they do not require a priori knowledge about the strength of the weak learners and are more computationally efficient. The adaptive algorithms are not statistically optimal, but they still come with reasonable performance guarantees. Empirical results on real data sets support the theoretical findings, and the proposed boosting algorithms outperform existing competitors on benchmark data sets.

Chapter 4 considers the partial information setting, where the learner does not observe the true labels. Partial feedback is common in practice because obtaining complete feedback can be costly. The chapter revisits the boosting algorithms of Chapters 2 and 3 and extends them to work with partial information feedback. Even though the learner receives much less information, comparable performance guarantees still hold.

Chapters 5 and 6 move on to another interesting area of online learning: restless bandit problems. Unlike the classical (stochastic) multi-armed bandit problem, where the reward distributions are unknown but stationary, in restless bandit problems the distributions can change over time. This extra layer of complexity allows more complicated models to be studied, but the analysis becomes even more difficult. In restless bandit problems, each arm is assumed to have a state that evolves according to an unknown Markov process, and the reward distribution depends on the arm's current state. This setting can be viewed as a sub-class of reinforcement learning, and the partial observability inherent in the problem makes the analysis very challenging. The well-known Thompson Sampling algorithm is analyzed, and a Bayesian regret bound is derived for it. Chapter 5 considers the episodic case, where the system periodically resets; Chapter 6 extends the analysis to the more challenging non-episodic (i.e., infinite time horizon) case. In both settings, Thompson Sampling algorithms (with slight modifications) enjoy sub-linear regret bounds, and empirical results on simulated data support this fact. The experiments also suggest that the algorithm can be used in the frequentist setting, even though the theoretical bounds are shown only for the Bayesian regret.
dc.language.iso: en_US
dc.subject: Online Learning
dc.subject: Boosting
dc.subject: Partial Feedback
dc.subject: Restless Bandits
dc.subject: Thompson Sampling
dc.title: New Directions in Online Learning: Boosting, Partial Information, and Non-Stationarity
dc.type: Thesis
dc.description.thesisdegreename: PhD en_US
dc.description.thesisdegreediscipline: Statistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Tewari, Ambuj
dc.contributor.committeemember: Scott, Clayton D
dc.contributor.committeemember: Nguyen, Long
dc.contributor.committeemember: Zhu, Ji
dc.subject.hlbsecondlevel: Statistics and Numeric Data
dc.subject.hlbtoplevel: Science
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/155110/1/yhjung_1.pdf
dc.identifier.orcid: 0000-0003-1625-4526
dc.identifier.name-orcid: Jung, Young Hun; 0000-0003-1625-4526 en_US
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
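
For readers who want a concrete feel for the two core techniques named in the abstract, the sketches below are minimal, self-contained illustrations, not the dissertation's algorithms. The first shows a generic Hedge-style weighted-majority combination of online weak learners for binary classification; the dissertation studies multi-class and multi-label variants with provably optimal and adaptive weighting schemes. All names and parameters here (WeakLearner, n_learners, eta) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class WeakLearner:
    """An online perceptron restricted to a random feature subset
    (deliberately weak so that combining learners actually helps)."""
    def __init__(self, dim, n_keep):
        self.mask = np.zeros(dim)
        self.mask[rng.choice(dim, n_keep, replace=False)] = 1.0
        self.w = np.zeros(dim)

    def predict(self, x):
        return 1.0 if self.w @ (x * self.mask) >= 0 else -1.0

    def update(self, x, y):
        if self.predict(x) != y:          # classic perceptron step
            self.w += y * (x * self.mask)

def online_boost(stream, dim, n_learners=10, eta=0.5):
    """Hedge-style combination: predict by weighted vote, then
    multiplicatively down-weight the learners that were wrong."""
    learners = [WeakLearner(dim, n_keep=2) for _ in range(n_learners)]
    alpha = np.ones(n_learners)           # combination weights
    mistakes = 0
    for t, (x, y) in enumerate(stream, 1):
        votes = np.array([h.predict(x) for h in learners])
        y_hat = 1.0 if alpha @ votes >= 0 else -1.0   # boosted prediction
        mistakes += (y_hat != y)
        alpha *= np.exp(-eta * (votes != y))          # penalize wrong voters
        alpha /= alpha.sum()
        for h in learners:                # each weak learner trains online too
            h.update(x, y)
    return mistakes / t

# Toy linearly separable stream.
w_star = rng.normal(size=20)
stream = [(x, 1.0 if w_star @ x >= 0 else -1.0)
          for x in rng.normal(size=(2000, 20))]
print("online error rate:", online_boost(stream, dim=20))
```

The second sketch shows classical Beta-Bernoulli Thompson Sampling on stationary arms, for intuition only. The restless-bandit versions analyzed in Chapters 5 and 6 are considerably more involved: they maintain a posterior over unknown Markov transition dynamics and latent arm states rather than over fixed reward means.

```python
import numpy as np

rng = np.random.default_rng(1)

def thompson_sampling(true_means, horizon=5000):
    """Beta-Bernoulli Thompson Sampling: sample one plausible mean per
    arm from the posterior, play the arm whose sample is largest."""
    k = len(true_means)
    successes = np.ones(k)                # Beta(1, 1) uniform priors
    failures = np.ones(k)
    total = 0.0
    for _ in range(horizon):
        theta = rng.beta(successes, failures)   # posterior samples
        arm = int(np.argmax(theta))
        r = float(rng.random() < true_means[arm])   # Bernoulli reward
        successes[arm] += r
        failures[arm] += 1.0 - r
        total += r
    return total, horizon * max(true_means) - total

reward, regret = thompson_sampling([0.3, 0.5, 0.7])
print(f"total reward: {reward:.0f}, empirical regret: {regret:.1f}")
```

Note the shared design principle: both sketches act greedily on a randomized summary of past feedback (a posterior sample, or multiplicative weights), which is what makes them cheap enough to run per data instance in a streaming setting.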

