Axiomatic Analysis of Unsupervised Diversity on Large-Scale High-dimensional Data
dc.contributor.author | Yan, Shiyan | |
dc.date.accessioned | 2021-09-24T19:08:53Z | |
dc.date.available | 2021-09-24T19:08:53Z | |
dc.date.issued | 2021 | |
dc.date.submitted | 2021 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/169733 | |
dc.description.abstract | Diversity is a concept widely used in every corner of our society. It represents the "breadth" of a set of objects, which needs to be promoted or reduced in different scenarios. Though many people have discussed it, how to define diversity in a reliable way is still a non-trivial task. In particular, when we are facing large-scale high-dimensional data, it is impossible to use pre-defined classifications to divide each object into categories and utilize diversity measurements in downstream tasks. An unsupervised methodology is necessary to handle this challenge. In this dissertation, I explore different methods to address the research question: how to measure diversity in an unsupervised manner based on large-scale high-dimensional data. I leverage representation learning algorithms to project objects into a discrete or continuous space and design several metrics to measure diversity in real-world applications. Furthermore, I introduce an axiomatic analysis method to help us choose and evaluate diversity metrics in both discrete and continuous settings. Following the guidelines derived from the axiomatic analysis, I define diversity in discrete space in terms of metrics that map distributions of topics to real numbers. I also find a simple and intuitive metric, defined in continuous space, that performs surprisingly well in satisfying different axioms. These sound and reliable metrics motivate me to focus on some controversial research topics in real applications. I explore the effect of research diversity, i.e., how broad researchers' research interests are. I conduct several studies to figure out whether publishing papers with high diversity results in greater research impact. Furthermore, I track trajectories of researchers' careers and try to find the effects of research diversity at different stages. Another real-world application appears in online social networks. Structural diversity, the closeness of users' friends, has a substantial influence on users' behavior from many perspectives. I define users' structural diversity using the results of axiomatic analysis. I track the pattern within the variation in structural diversity in both static and dynamic networks and simulate it with an intuitive graph generation algorithm. An interesting pattern of structural diversity and user engagement in online social media is illustrated. | |
dc.language.iso | en_US | |
dc.subject | axiomatic analysis | |
dc.subject | metric design | |
dc.subject | research diversity | |
dc.subject | social network | |
dc.subject | graph generation | |
dc.title | Axiomatic Analysis of Unsupervised Diversity on Large-Scale High-dimensional Data | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Information | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Mei, Qiaozhu | |
dc.contributor.committeemember | Vydiswaran, VG Vinod | |
dc.contributor.committeemember | Romero, Daniel M | |
dc.contributor.committeemember | Teplitskiy, Misha | |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbsecondlevel | Information and Library Science | |
dc.subject.hlbtoplevel | Engineering | |
dc.subject.hlbtoplevel | Social Sciences | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/169733/1/shiyansi_1.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/2778 | |
dc.identifier.orcid | 0000-0002-3264-149X | |
dc.identifier.name-orcid | Yan, Shiyan; 0000-0002-3264-149X | en_US |
dc.working.doi | 10.7302/2778 | en |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |