Modeling Simplex-valued Data and Latent Structures

Lei, Rayleigh

dc.contributor.author	Lei, Rayleigh
dc.date.accessioned	2022-09-06T15:59:17Z
dc.date.available	2022-09-06T15:59:17Z
dc.date.issued	2022
dc.date.submitted	2022
dc.identifier.uri	https://hdl.handle.net/2027.42/174205
dc.description.abstract	Examples of data that lie in a simplex abound in a variety of fields. The chemical, mineral, and/or fossil percentages of a collection of rocks are of interest to geologists. Demographers and policy makers might examine the income proportions or racial compositions of neighborhoods or other political units. However, it can be challenging to model this type of data, particularly because the values of each observation must sum to one. If not handled correctly, this constraint might induce what Karl Pearson called "spurious correlation". As a trivial example, take an observation that has two values that sum to one. Then the values must be negatively correlated with each other, because each value equals one minus the other. While traditional approaches for these types of data involve analyzing the log ratio transforms, this might be problematic if any observation has a zero as one of its values, or when the results need to be interpreted. Additional challenges arise if the data change over time. The first two chapters lay the framework for how such data may be modeled. The second chapter proposes doing so using a general affine transformation for the overall change and a sufficiently rich error model for the difference between the overall transformation and the observation at the next time point. Of the three models explored, the rotational geodesic error model is the most promising. However, it might not be appropriate to assume that the direction in which observations move is uniformly distributed. Using ideas from directional statistics, we discuss in the third chapter how to model directions that appear to be similar for observations with similar values. In both chapters, we run simulation studies and analyze the income proportions from Los Angeles County. In each case, our analysis is able to discover trends consistent with larger macroeconomic ones and provide further details about these trends. The last chapter discusses tree-based mixtures of probability simplices. In other words, the simplices share vertices in a way that can be represented by a tree, with the root node corresponding to a vertex shared by all simplices and the leaf nodes corresponding to vertices present in a single simplex. We show when such models have posterior consistency and demonstrate how to fit them efficiently using geometric methods. We apply them to analyze a subset of articles from the New York Times, uncovering meaningful topics and interesting semantic relationships between these topics. While we leave it to future work, these methods might also be combined with the ones from previous chapters to model how sub-regions of data that lie in a simplex change over time.
dc.language.iso	en_US
dc.subject	Bayesian statistics
dc.subject	Bayesian modeling
dc.subject	Bayesian computation
dc.subject	Simplex
dc.title	Modeling Simplex-valued Data and Latent Structures
dc.type	Thesis
dc.description.thesisdegreename	PhD	en_US
dc.description.thesisdegreediscipline	Statistics
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember	Nguyen, Long
dc.contributor.committeemember	Feinberg, Fred
dc.contributor.committeemember	Chen, Yang
dc.contributor.committeemember	Regier, Jeffrey
dc.subject.hlbsecondlevel	Statistics and Numeric Data
dc.subject.hlbtoplevel	Science
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/174205/1/rayleigh_1.pdf
dc.identifier.doi	https://dx.doi.org/10.7302/5936
dc.identifier.orcid	0000-0002-0444-9708
dc.identifier.name-orcid	Lei, Rayleigh; 0000-0002-0444-9708	en_US
dc.working.doi	10.7302/5936	en
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: rayleigh_1.pdf
Size: 25.12MB
Format: PDF
