Flexible query facilities for heterogeneous semi-structured data.

dc.contributor.author: Li, Yunyao
dc.contributor.advisor: Jagadish, Hosagrahar V.
dc.date.accessioned: 2016-08-30T16:14:56Z
dc.date.available: 2016-08-30T16:14:56Z
dc.date.issued: 2007
dc.identifier.uri: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:3253332
dc.identifier.uri: https://hdl.handle.net/2027.42/126485
dc.description.abstract: This dissertation studies flexible query facilities for semi-structured data in a heterogeneous environment, with a focus on XML databases. The popularity of XML brings with it the need for a wide spectrum of users to query XML documents. Although formal database query languages such as XQuery provide precise access to XML data, querying XML with such rigid formal languages is challenging: users must have perfect knowledge of the database schema, the query language syntax, and the query semantics. These challenges lead to requests for flexible yet accurate query facilities over XML documents. This dissertation discusses a two-part solution for supporting flexible queries over XML documents: (a) Schema-Free XQuery, which allows database queries to be specified with limited or even no schema knowledge, and (b) the Natural Language Interface for Querying XML (NaLIX), which translates database queries with complex semantics, posed in plain English, into Schema-Free XQuery expressions. NaLIX enables users to pose complex database queries in plain English without knowing any formal query language or the underlying database schema. NaLIX also supports iterative user search by allowing queries to be stated with respect to previous queries. NaLIX does not depend on any domain knowledge, but it can be improved further by automatically learning domain information. We also present a novel stack-based algorithm and cost-based optimization techniques that allow Schema-Free XQuery and NaLIX to be implemented efficiently. In addition, we report experimental results that validate the proposed solution. Finally, we discuss how our solution improves on the state of the art through comparison with previous work.
dc.format.extent: 163 p.
dc.language: English
dc.language.iso: EN
dc.subject: Data
dc.subject: Facilities
dc.subject: Flexible
dc.subject: Heterogeneous
dc.subject: Natural Language Interface
dc.subject: Query
dc.subject: Schema-free
dc.subject: Semi
dc.subject: Semistructured
dc.subject: Structured
dc.title: Flexible query facilities for heterogeneous semi-structured data.
dc.type: Thesis
dc.description.thesisdegreename: PhD
dc.description.thesisdegreediscipline: Applied Sciences
dc.description.thesisdegreediscipline: Computer science
dc.description.thesisdegreediscipline: Experimental psychology
dc.description.thesisdegreediscipline: Language, Literature and Linguistics
dc.description.thesisdegreediscipline: Linguistics
dc.description.thesisdegreediscipline: Psychology
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/126485/2/3253332.pdf
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

