It's Data All the Way Down: Exploring the Relationship Between Machine Learning and Data Management

dc.contributor.author: Anderson, Michael
dc.date.accessioned: 2020-01-27T16:26:00Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2020-01-27T16:26:00Z
dc.date.issued: 2019
dc.date.submitted: 2019
dc.identifier.uri: https://hdl.handle.net/2027.42/153458
dc.description.abstract: Data is central to machine learning: models are trained with data, trained models infer their predictions over input data, and the resulting inferences are themselves data. This being the case, there should be a natural relationship between machine learning and data management techniques. Much of machine learning research, perhaps understandably, focuses strictly on algorithmic improvements, chasing ever-increasing state-of-the-art accuracy measurements on a task of choice. Likewise, data management research has been slow to apply recent machine learning breakthroughs, like deep learning, to classic data management tasks. In this dissertation, we demonstrate this relationship between machine learning and data management with a series of projects that improve aspects of machine learning through data management or improve data management with the addition of machine learning. Specifically, we detail two systems that use database-style methods to improve runtime issues traditionally associated with machine learning and a third project that uses recent machine learning methods to solve data quality issues. Our system Zombie shows that novel data indexing methods can greatly reduce the time needed to evaluate the effectiveness of feature engineering, thereby reducing the time needed to train accurate machine learning models. With our system Tahoma, we show that by using particular physical representations of the images used as input to convolutional neural network classifier cascades, content can be quickly extracted to support binary predicates used in a video analytics database. And our system Grover demonstrates that universal embeddings, like those used in computer vision or natural language processing, can be created for relational data, with both column and table embeddings used to improve the performance of data integration tasks.
Our work shows that machine learning and data management go hand-in-hand, and that taking a holistic view of both can lead to improvements in each field.
dc.language.iso: en_US
dc.subject: database
dc.subject: machine learning
dc.subject: deep learning
dc.subject: data management
dc.subject: feature engineering
dc.subject: image recognition
dc.title: It's Data All the Way Down: Exploring the Relationship Between Machine Learning and Data Management
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Computer Science & Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Cafarella, Michael John
dc.contributor.committeemember: Collins-Thompson, Kevyn
dc.contributor.committeemember: Jagadish, Hosagrahar V
dc.contributor.committeemember: Wenisch, Thomas F
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbtoplevel: Engineering
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/153458/1/mrander_1.pdf
dc.identifier.orcid: 0000-0002-0959-4234
dc.identifier.name-orcid: Anderson, Michael; 0000-0002-0959-4234 (en_US)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)