
Modeling Simplex-valued Data and Latent Structures

dc.contributor.author: Lei, Rayleigh
dc.date.accessioned: 2022-09-06T15:59:17Z
dc.date.available: 2022-09-06T15:59:17Z
dc.date.issued: 2022
dc.date.submitted: 2022
dc.identifier.uri: https://hdl.handle.net/2027.42/174205
dc.description.abstract: Examples of data that lie in a simplex abound in a variety of fields. The chemical, mineral, and/or fossil percentages of a collection of rocks are of interest to geologists. Demographers and policy makers might examine the income proportions or racial compositions of neighborhoods or other political units. However, it can be challenging to model this type of data, particularly because the values of each observation must sum to one. If not handled correctly, this constraint might induce what Karl Pearson called "spurious correlation". As a trivial example, take observations that each have two values summing to one. Then the two values must be negatively correlated with each other. While traditional approaches for this type of data involve analyzing log-ratio transforms, these might be problematic if any observation has a zero as one of its values, and the results can be difficult to interpret. Additional challenges arise if the data change over time. The first two chapters lay the framework for how such data may be modeled. The second chapter proposes doing so using a general affine transformation for the overall change and a sufficiently rich error model for the difference between the overall transformation and the observation at the next time point. Of the three models explored, the rotational geodesic error model is the most promising. However, it might not be appropriate to assume that the direction in which observations move is uniformly distributed. Using ideas from directional statistics, we discuss in the third chapter how to model directions that appear to be similar for observations with similar values. In both chapters, we run simulation studies and analyze the income proportions from Los Angeles County. In each case, our analysis is able to discover trends consistent with larger macroeconomic ones and provide further details about these trends. The last chapter discusses tree-based mixtures of probability simplices. In other words, the simplices share vertices in a way that can be represented by a tree, with the root node corresponding to a vertex shared by all simplices and the leaf nodes corresponding to vertices present in only one simplex. We show when such models have posterior consistency and demonstrate how to fit them efficiently using geometric methods. We apply them to analyze a subset of articles from the New York Times, uncovering meaningful topics and interesting semantic relationships between these topics. While we leave it to future work, these methods might also be combined with the ones from previous chapters to model how sub-regions of data that lie in a simplex change over time.
dc.language.iso: en_US
dc.subject: Bayesian statistics
dc.subject: Bayesian modeling
dc.subject: Bayesian computation
dc.subject: Simplex
dc.title: Modeling Simplex-valued Data and Latent Structures
dc.type: Thesis
dc.description.thesisdegreename: PhD [en_US]
dc.description.thesisdegreediscipline: Statistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Nguyen, Long
dc.contributor.committeemember: Feinberg, Fred
dc.contributor.committeemember: Chen, Yang
dc.contributor.committeemember: Regier, Jeffrey
dc.subject.hlbsecondlevel: Statistics and Numeric Data
dc.subject.hlbtoplevel: Science
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/174205/1/rayleigh_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/5936
dc.identifier.orcid: 0000-0002-0444-9708
dc.identifier.name-orcid: Lei, Rayleigh; 0000-0002-0444-9708 [en_US]
dc.working.doi: 10.7302/5936 [en]
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

