Axiomatic Analysis of Unsupervised Diversity on Large-Scale High-dimensional Data

Yan, Shiyan

dc.contributor.author	Yan, Shiyan
dc.date.accessioned	2021-09-24T19:08:53Z
dc.date.available	2021-09-24T19:08:53Z
dc.date.issued	2021
dc.date.submitted	2021
dc.identifier.uri	https://hdl.handle.net/2027.42/169733
dc.description.abstract	Diversity is a concept widely used in every corner of our society. It represents the "breadth" of a set of objects, which needs to be promoted or reduced in different scenarios. Although diversity has been widely discussed, defining it reliably remains a non-trivial task. In particular, with large-scale high-dimensional data, it is impossible to use pre-defined classifications to divide objects into categories and apply diversity measurements in downstream tasks. An unsupervised methodology is necessary to handle this challenge. In this dissertation, I explore different methods to address the research question: how to measure diversity in an unsupervised manner based on large-scale high-dimensional data. I leverage representation learning algorithms to project objects into a discrete or continuous space and design several metrics to measure diversity in real-world applications. Furthermore, I introduce an axiomatic analysis method to help choose and evaluate diversity metrics in both discrete and continuous settings. Following the guidelines derived from the axiomatic analysis, I define diversity metrics in discrete space that map distributions of topics to real numbers. I also find a simple and intuitive diversity metric, defined in continuous space, that satisfies the different axioms surprisingly well. These sound and reliable metrics motivate me to focus on some controversial research topics in real applications. I explore the effect of research diversity, i.e., how broad researchers' research interests are. I conduct several studies to determine whether publishing papers with high diversity results in greater research impact. Furthermore, I track the trajectories of researchers' careers and examine the effects of research diversity at different stages. Another real-world application appears in online social networks. Structural diversity, the closeness of users' friends, has a substantial influence on users' behavior from many perspectives. I define users' structural diversity using the results of the axiomatic analysis. I track the pattern of variation in structural diversity in both static and dynamic networks and simulate it with an intuitive graph generation algorithm. An interesting pattern relating structural diversity and user engagement in online social media is illustrated.
dc.language.iso	en_US
dc.subject	axiomatic analysis
dc.subject	metric design
dc.subject	research diversity
dc.subject	social network
dc.subject	graph generation
dc.title	Axiomatic Analysis of Unsupervised Diversity on Large-Scale High-dimensional Data
dc.type	Thesis
dc.description.thesisdegreename	PhD	en_US
dc.description.thesisdegreediscipline	Information
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember	Mei, Qiaozhu
dc.contributor.committeemember	Vydiswaran, VG Vinod
dc.contributor.committeemember	Romero, Daniel M
dc.contributor.committeemember	Teplitskiy, Misha
dc.subject.hlbsecondlevel	Computer Science
dc.subject.hlbsecondlevel	Information and Library Science
dc.subject.hlbtoplevel	Engineering
dc.subject.hlbtoplevel	Social Sciences
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/169733/1/shiyansi_1.pdf
dc.identifier.doi	https://dx.doi.org/10.7302/2778
dc.identifier.orcid	0000-0002-3264-149X
dc.identifier.name-orcid	Yan, Shiyan; 0000-0002-3264-149X	en_US
dc.working.doi	10.7302/2778	en
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: shiyansi_1.pdf
Size: 2.940MB
Format: PDF
