Incorporating Provenance in Database Systems
dc.contributor.author | Chapman, Adriane P. | en_US |
dc.date.accessioned | 2009-02-05T19:27:06Z | |
dc.date.available | NO_RESTRICTION | en_US |
dc.date.available | 2009-02-05T19:27:06Z | |
dc.date.issued | 2008 | en_US |
dc.date.submitted | en_US | |
dc.identifier.uri | https://hdl.handle.net/2027.42/61645 | |
dc.description.abstract | The importance of maintaining provenance has been widely recognized. Currently there are two approaches: provenance generated within workflow frameworks, and provenance within a contained relational database. Workflow provenance allows workflow re-execution and can offer some explanation of results. Within relational databases, knowledge of SQL queries and relational operators is used to express provenance. There is a disconnect between these two areas of provenance research. Techniques that work in relational databases cannot be applied to workflow systems because of heterogeneous data types and black-box operators. Meanwhile, the real-life utility of workflow systems has not been extended to database provenance. In the gap between provenance in workflow systems and databases, there is a myriad of systems that need provenance. Examples include creating a new dataset, like MiMI, from several sources and processes, or building an algorithm that generates sequence alignments, like MiBlast. These hybrid systems cannot be mashed into a workflow framework and do not exist solely within a database. This work solves issues that block provenance usage in hybrid systems. In particular, we look at capturing, storing, and using provenance information outside of workflow and database provenance systems. Database provenance and workflow systems provide no support for tracking the provenance of user actions, yet manual effort is often a large part of the work in these hybrid systems. We describe an approach to track and record the user's actions in a queryable form. Once provenance is captured, storage can become prohibitively expensive in both hybrid and workflow systems. We identify several techniques to reduce the size of the provenance store. Additionally, usable provenance is a problem in workflow, database, and hybrid provenance systems. Provenance contains both too much and too little information. We highlight the missing information that can assist user understanding, and develop a model of provenance answers to decrease information overload. Finally, workflow and database systems are designed to explain the results users see; they do not explain why items are not in the result. We allow researchers to specify what they are looking for and answer why it does not exist in the result set. | en_US |
dc.format.extent | 1872674 bytes | |
dc.format.extent | 1373 bytes | |
dc.format.mimetype | application/pdf | |
dc.format.mimetype | text/plain | |
dc.language.iso | en_US | en_US |
dc.subject | Provenance | en_US |
dc.subject | Lineage | en_US |
dc.subject | Algorithms | en_US |
dc.subject | Usability | en_US |
dc.subject | Provenance Capture | en_US |
dc.subject | Provenance Compression | en_US |
dc.title | Incorporating Provenance in Database Systems | en_US |
dc.type | Thesis | en_US |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Computer Science & Engineering | en_US |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | en_US |
dc.contributor.committeemember | Jagadish, Hosagrahar V. | en_US |
dc.contributor.committeemember | Ackerman, Mark Steven | en_US |
dc.contributor.committeemember | Patel, Jignesh M. | en_US |
dc.contributor.committeemember | States, David | en_US |
dc.subject.hlbsecondlevel | Computer Science | en_US |
dc.subject.hlbsecondlevel | Engineering (General) | en_US |
dc.subject.hlbtoplevel | Engineering | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/61645/1/apchapma_1.pdf | |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |