Axiomatic Analysis of Unsupervised Diversity on Large-Scale High-dimensional Data

dc.contributor.author: Yan, Shiyan
dc.date.accessioned: 2021-09-24T19:08:53Z
dc.date.available: 2021-09-24T19:08:53Z
dc.date.issued: 2021
dc.date.submitted: 2021
dc.identifier.uri: https://hdl.handle.net/2027.42/169733
dc.description.abstract: Diversity is a concept widely used in every corner of our society. It represents the "breadth" of a set of objects, which may need to be promoted or reduced depending on the scenario. Though many people have discussed it, defining diversity in a reliable way is still a non-trivial task. In particular, when we face large-scale high-dimensional data, it is impossible to use pre-defined classifications to divide objects into categories and apply diversity measurements in downstream tasks. An unsupervised methodology is necessary to handle this challenge. In this dissertation, I explore different methods to address the research question: how can diversity be measured in an unsupervised manner on large-scale high-dimensional data? I leverage representation learning algorithms to project objects into a discrete or continuous space and design several metrics to measure diversity in real-world applications. Furthermore, I introduce an axiomatic analysis method to help choose and evaluate diversity metrics in both discrete and continuous settings. Following the guidelines derived from the axiomatic analysis, I define diversity in discrete space in terms of metrics that map distributions of topics to real numbers. I also find a simple and intuitive diversity metric, defined in continuous space, that satisfies the different axioms surprisingly well. These sound and reliable metrics motivate me to examine some controversial research topics in real applications. I explore the effect of research diversity, i.e., how broad researchers' research interests are. I conduct several studies to determine whether publishing papers with high diversity results in greater research impact. Furthermore, I track the trajectories of researchers' careers and examine the effects of research diversity at different stages. Another real-world application appears in online social networks. Structural diversity, the closeness of users' friends, has a substantial influence on users' behavior from many perspectives. I define users' structural diversity using the results of the axiomatic analysis. I track the pattern of variation in structural diversity in both static and dynamic networks and simulate it with an intuitive graph generation algorithm. An interesting pattern relating structural diversity and user engagement in online social media is illustrated.
dc.language.iso: en_US
dc.subject: axiomatic analysis
dc.subject: metric design
dc.subject: research diversity
dc.subject: social network
dc.subject: graph generation
dc.title: Axiomatic Analysis of Unsupervised Diversity on Large-Scale High-dimensional Data
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Information
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Mei, Qiaozhu
dc.contributor.committeemember: Vydiswaran, VG Vinod
dc.contributor.committeemember: Romero, Daniel M
dc.contributor.committeemember: Teplitskiy, Misha
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbsecondlevel: Information and Library Science
dc.subject.hlbtoplevel: Engineering
dc.subject.hlbtoplevel: Social Sciences
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/169733/1/shiyansi_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/2778
dc.identifier.orcid: 0000-0002-3264-149X
dc.identifier.name-orcid: Yan, Shiyan; 0000-0002-3264-149X (en_US)
dc.working.doi: 10.7302/2778 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

