Active Learning in Non-parametric and Federated Settings
dc.contributor.author | Goetz, Jonathan | |
dc.date.accessioned | 2020-10-04T23:30:45Z | |
dc.date.available | NO_RESTRICTION | |
dc.date.available | 2020-10-04T23:30:45Z | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/163105 | |
dc.description.abstract | In many real-world supervised learning problems, it is easy or cheap to acquire unlabelled data but challenging or expensive to label it. Active learning aims to exploit this abundance of unlabelled data by sequentially selecting data points to label, attempting to choose the points most useful for the underlying prediction problem. In this thesis we present several contributions to the field of active learning. The first part examines active learning for regression, an understudied topic compared with classification. We consider active learning for non-parametric regression, a particularly challenging problem since it is known that, under standard smoothness conditions, the minimax rates for active and passive learning are the same. Nonetheless, we provide an active learning algorithm with provable improvement over passive learning when the underlying estimator is a purely random decision tree. We experimentally confirm that the gains can be substantial and provide guidance for practitioners. The second part returns to classification but considers all weighted averaging estimators. Here we work to extend the celebrated Stone's theorem on consistency to actively sampled data. We provide an augmentation that can be applied to a wide range of active learning algorithms and allows us to replicate the conclusions of Stone's theorem in the noiseless case. However, this generalizes to the noisy case only for some classical Stone estimators, while for others it can fail catastrophically. We explore the cause of this divergent behaviour and provide further conditions that explain why some estimators remain consistent while others do not. The final part addresses the emerging area of federated learning. We study the problem of user selection during training and expose its similarities to active learning.
We then propose Active Federated Learning, which adapts techniques from active learning to this new setting, and show that the method can reduce the communication cost of training federated models by 20-70%. | |
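For readers unfamiliar with the setting the abstract describes, the following is a minimal sketch of pool-based active learning with uncertainty sampling on a noiseless one-dimensional threshold problem. It is an illustrative toy, not any of the thesis's algorithms: the estimator, oracle, and all names below are hypothetical. In this noiseless setting, querying the point nearest the estimated boundary behaves like binary search, the classic example of active learning outperforming passive sampling.

```python
import random

def fit_threshold(labelled):
    # Toy 1-d threshold estimator (illustrative stand-in for any
    # estimator): the boundary is placed midway between the largest
    # point labelled 0 and the smallest point labelled 1 seen so far.
    zeros = [x for x, y in labelled if y == 0]
    ones = [x for x, y in labelled if y == 1]
    lo = max(zeros) if zeros else 0.0
    hi = min(ones) if ones else 1.0
    return (lo + hi) / 2.0

def active_threshold_learning(pool, oracle, budget, seed=0):
    # Pool-based uncertainty sampling: at each round, label the
    # unlabelled point closest to the current estimated boundary.
    rng = random.Random(seed)
    x0 = pool.pop(rng.randrange(len(pool)))
    labelled = [(x0, oracle(x0))]          # seed with one random label
    for _ in range(budget):
        t = fit_threshold(labelled)
        x = min(pool, key=lambda u: abs(u - t))
        pool.remove(x)
        labelled.append((x, oracle(x)))    # labelling cost paid here
    return fit_threshold(labelled)

# Example: true boundary at 0.62, labels noiseless.
pool = [i / 200.0 for i in range(200)]
est = active_threshold_learning(pool, lambda x: int(x > 0.62), budget=10)
```

With a budget of 10 queries the estimate lands very close to the true boundary, whereas 10 passively (uniformly) sampled labels would typically leave a much wider uncertain interval.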
dc.language.iso | en_US | |
dc.subject | Active Learning | |
dc.subject | Federated Learning | |
dc.title | Active Learning in Non-parametric and Federated Settings | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Statistics | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Tewari, Ambuj | |
dc.contributor.committeemember | Zimmerman, Paul | |
dc.contributor.committeemember | Nguyen, Long | |
dc.contributor.committeemember | Ritov, Yaacov | |
dc.subject.hlbsecondlevel | Statistics and Numeric Data | |
dc.subject.hlbtoplevel | Science | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/163105/1/jrgoetz_1.pdf | en_US |
dc.identifier.orcid | 0000-0002-9954-9460 | |
dc.identifier.name-orcid | Goetz, Jack; 0000-0002-9954-9460 | en_US |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |