Active Learning in Non-parametric and Federated Settings
dc.contributor.author | Goetz, Jonathan | |
dc.date.accessioned | 2020-10-04T23:30:45Z | |
dc.date.available | NO_RESTRICTION | |
dc.date.available | 2020-10-04T23:30:45Z | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/163105 | |
dc.description.abstract | In many real-world supervised learning problems, it is easy or cheap to acquire unlabelled data but challenging or expensive to label it. Active learning aims to exploit this abundance of unlabelled data by sequentially selecting data points to label, attempting to choose the points most useful for the underlying prediction problem. In this thesis we present several contributions to the field of active learning. The first part examines active learning for regression, an understudied topic compared with classification. We consider active learning for non-parametric regression, a particularly challenging problem since it is known that, under standard smoothness conditions, the minimax rates for active and passive learning are the same. Nonetheless, we provide an active learning algorithm with provable improvement over passive learning when the underlying estimator is a purely random decision tree. We experimentally confirm that the gains can be substantial and provide guidance for practitioners. The second part returns to classification but considers all weighted averaging estimators. Here we work to extend the celebrated Stone's theorem on consistency to actively sampled data. We provide an augmentation that can be applied to a wide range of active learning algorithms and allows us to replicate the conclusions of Stone's theorem in the noiseless case. However, this generalizes to the noisy case only for some classical Stone estimators, while for others it can fail catastrophically. We explore the cause of this divergent behaviour and provide further conditions that explain why some estimators remain consistent while others do not. The final part addresses the emerging area of federated learning. We study the problem of user selection during training and expose its similarities to active learning.
We then propose Active Federated Learning, which adapts techniques from active learning to this new setting, and show that the method can reduce the communication cost of training federated models by 20-70%. | |
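For readers unfamiliar with the setting the abstract describes, the following is a minimal sketch of pool-based active learning with uncertainty sampling on a noiseless one-dimensional threshold problem. It is an illustrative toy, not any of the thesis's algorithms: the estimator, oracle, and all names below are hypothetical. In this noiseless setting, querying the point nearest the estimated boundary behaves like binary search, the classic example of active learning outperforming passive sampling.

```python
import random

def fit_threshold(labelled):
    # Toy 1-d threshold estimator (illustrative stand-in for any
    # estimator): the boundary is placed midway between the largest
    # point labelled 0 and the smallest point labelled 1 seen so far.
    zeros = [x for x, y in labelled if y == 0]
    ones = [x for x, y in labelled if y == 1]
    lo = max(zeros) if zeros else 0.0
    hi = min(ones) if ones else 1.0
    return (lo + hi) / 2.0

def active_threshold_learning(pool, oracle, budget, seed=0):
    # Pool-based uncertainty sampling: at each round, label the
    # unlabelled point closest to the current estimated boundary.
    rng = random.Random(seed)
    x0 = pool.pop(rng.randrange(len(pool)))
    labelled = [(x0, oracle(x0))]          # seed with one random label
    for _ in range(budget):
        t = fit_threshold(labelled)
        x = min(pool, key=lambda u: abs(u - t))
        pool.remove(x)
        labelled.append((x, oracle(x)))    # labelling cost paid here
    return fit_threshold(labelled)

# Example: true boundary at 0.62, labels noiseless.
pool = [i / 200.0 for i in range(200)]
est = active_threshold_learning(pool, lambda x: int(x > 0.62), budget=10)
```

With a budget of 10 queries the estimate lands very close to the true boundary, whereas 10 passively (uniformly) sampled labels would typically leave a much wider uncertain interval.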
dc.language.iso | en_US | |
dc.subject | Active Learning | |
dc.subject | Federated Learning | |
dc.title | Active Learning in Non-parametric and Federated Settings | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Statistics | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Tewari, Ambuj | |
dc.contributor.committeemember | Zimmerman, Paul | |
dc.contributor.committeemember | Nguyen, Long | |
dc.contributor.committeemember | Ritov, Yaacov | |
dc.subject.hlbsecondlevel | Statistics and Numeric Data | |
dc.subject.hlbtoplevel | Science | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/163105/1/jrgoetz_1.pdf | en_US |
dc.identifier.orcid | 0000-0002-9954-9460 | |
dc.identifier.name-orcid | Goetz, Jack; 0000-0002-9954-9460 | en_US |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |