Generative AI-augmented and User-centric Research Data Discovery and Reuse

Fan, Lizhou

dc.contributor.author	Fan, Lizhou
dc.date.accessioned	2024-09-03T18:40:23Z
dc.date.available	2024-09-03T18:40:23Z
dc.date.issued	2024
dc.date.submitted	2024
dc.identifier.uri	https://hdl.handle.net/2027.42/194598
dc.description.abstract	This dissertation addresses the challenge of enhancing research data discovery and reuse in the face of escalating data volume and complexity. Traditional metadata-driven search tools often fall short in providing nuanced context and interdisciplinary connections critical for efficient scientific exploration and collaboration. To address these limitations, we developed the Generative AI-augmented and User-centric Data Search (GAUDS) system, which integrates Large Language Models (LLMs) and Scholarly Knowledge Graphs (SKGs) to parse natural language queries and visualize data relationships, thereby fostering a deeper understanding of available research resources. The study details the development and implementation of the GAUDS system, including the conceptualization of the guiding principles Connectivity, Effectiveness, Visibility and Interactivity (CEVI) that support and evaluate the discovery and reuse of research data. It further explores the construction of the ICPSR Health and Medical Scholarly Knowledge Graph (IHSKG), which represents complex connections in research data and prototypes interdisciplinary reuse potentials. The abilities of LLMs to perform complex reasoning were assessed, informing the system's ability to understand and manipulate large datasets effectively. The development of the GAUDS system, informed by insights gained from prototyping and evaluating user-centric utility, leads to a comprehensive analysis of focus group feedback. This feedback evaluates the system’s impact on enhancing data discoverability and usability. The GAUDS system, by providing effective navigation aids, relevant dataset suggestions, and contextualized reuse guides, not only enhances user engagement and satisfaction, but also demonstrates the transformative potential of generative AI in specialized academic domains such as health and medical research. 
This research contributes to the fields of information retrieval and data management by proposing a novel approach that combines human-curated knowledge graphs with generative AI algorithms to significantly improve data discovery and reuse. Future work will aim to productionize the GAUDS system, expand its scalability across different domains, and explore its broader potential to support open science initiatives.
dc.language.iso	en_US
dc.subject	data discovery
dc.subject	information retrieval
dc.subject	generative artificial intelligence
dc.subject	data reuse
dc.title	Generative AI-augmented and User-centric Research Data Discovery and Reuse
dc.type	Thesis
dc.description.thesisdegreename	PhD
dc.description.thesisdegreediscipline	Information
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember	Hemphill, Libby
dc.contributor.committeemember	Jagadish, H V
dc.contributor.committeemember	Gilliland, Anne
dc.contributor.committeemember	Levenstein, Maggie
dc.subject.hlbsecondlevel	Information and Library Science
dc.subject.hlbtoplevel	Social Sciences
dc.contributor.affiliationumcampus	Ann Arbor
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/194598/1/lizhouf_1.pdf
dc.identifier.doi	https://dx.doi.org/10.7302/23946
dc.identifier.orcid	0000-0002-7962-9113
dc.identifier.name-orcid	Fan, Lizhou; 0000-0002-7962-9113	en_US
dc.working.doi	10.7302/23946	en
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: lizhouf_1.pdf
Size: 33.56MB
Format: PDF
