Modeling Simplex-valued Data and Latent Structures
dc.contributor.author | Lei, Rayleigh | |
dc.date.accessioned | 2022-09-06T15:59:17Z | |
dc.date.available | 2022-09-06T15:59:17Z | |
dc.date.issued | 2022 | |
dc.date.submitted | 2022 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/174205 | |
dc.description.abstract | Examples of data that lie in a simplex abound in a variety of fields. The chemical, mineral, and/or fossil percentages of a collection of rocks are of interest to geologists. Demographers and policy makers might examine the income proportions or racial compositions of neighborhoods or other political units. However, it can be challenging to model this type of data, particularly because the values of each observation must sum to one. If not handled correctly, this constraint might induce what Karl Pearson called "spurious correlation". As a trivial example, take an observation that has two values that sum to one. Then, the values must be negatively correlated with each other. While traditional approaches for these types of data involve analyzing the log ratio transforms, this might be problematic when an observation has a zero as one of its values, since the log ratio is then undefined, and it can make interpretation harder. Additional challenges arise if the data change over time. The first two chapters lay the framework for how such data may be modeled. The second chapter proposes doing so using a general affine transformation for the overall change and a sufficiently rich error model for the difference between the overall transformation and the observation at the next time point. Of the three models explored, the rotational geodesic error model is most promising. However, it might not be appropriate to assume that the direction in which observations moved is uniformly distributed. Using ideas from directional statistics, we discuss in the third chapter how to model directions that appear to be similar for observations with similar values. In both chapters, we run simulation studies and analyze the income proportions from Los Angeles County. In each case, our analysis is able to discover trends consistent with larger macroeconomic ones and provide further details about these trends. The last chapter discusses tree-based mixtures of probability simplices.
In other words, the simplices share vertices in a way that can be represented by a tree with the root node corresponding to a vertex shared by all simplices and the leaf nodes corresponding to vertices present in one simplex. We show when such models have posterior consistency and demonstrate how to efficiently fit them using geometric methods. We apply them to analyze a subset of articles from the New York Times, uncovering meaningful topics and interesting semantic relationships between these topics. While we leave it to future work, these methods might also be combined with the ones from previous chapters to model how sub-regions of data that lie in a simplex change over time. | |
dc.language.iso | en_US | |
dc.subject | Bayesian statistics | |
dc.subject | Bayesian modeling | |
dc.subject | Bayesian computation | |
dc.subject | Simplex | |
dc.title | Modeling Simplex-valued Data and Latent Structures | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Statistics | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Nguyen, Long | |
dc.contributor.committeemember | Feinberg, Fred | |
dc.contributor.committeemember | Chen, Yang | |
dc.contributor.committeemember | Regier, Jeffrey | |
dc.subject.hlbsecondlevel | Statistics and Numeric Data | |
dc.subject.hlbtoplevel | Science | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/174205/1/rayleigh_1.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/5936 | |
dc.identifier.orcid | 0000-0002-0444-9708 | |
dc.identifier.name-orcid | Lei, Rayleigh; 0000-0002-0444-9708 | en_US |
dc.working.doi | 10.7302/5936 | en |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |
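The abstract's two-value example of "spurious correlation", and its remark about zeros under log ratio transforms, can be illustrated with a small numerical sketch. This is not code from the thesis: the gamma-distributed raw amounts, the sample size, and all variable names are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw amounts: two independent positive quantities per observation.
raw = rng.gamma(shape=2.0, scale=1.0, size=(1000, 2))

# Closure: divide each row by its total so the two parts sum to one.
comp = raw / raw.sum(axis=1, keepdims=True)

# Correlation between the two parts before and after closure.
raw_corr = np.corrcoef(raw[:, 0], raw[:, 1])[0, 1]
comp_corr = np.corrcoef(comp[:, 0], comp[:, 1])[0, 1]

# A composition with a zero part: its log ratio is not finite.
zero_comp = np.array([0.0, 1.0])
with np.errstate(divide="ignore"):
    log_ratio = np.log(zero_comp[0] / zero_comp[1])

print(f"correlation of raw amounts:   {raw_corr:.3f}")
print(f"correlation after closure:    {comp_corr:.3f}")
print(f"log ratio with a zero part:   {log_ratio}")
```

Because the second part equals one minus the first after closure, the two parts are perfectly negatively correlated even though the raw amounts were generated independently, and the log ratio of the composition containing a zero evaluates to negative infinity.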