On Issues of Scale and Dependence in Spatial and Spatio-Temporal Data

dc.contributor.author: Benedetti, Marco
dc.date.accessioned: 2020-01-27T16:23:35Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2020-01-27T16:23:35Z
dc.date.issued: 2019
dc.date.submitted: 2019
dc.identifier.uri: https://hdl.handle.net/2027.42/153375
dc.description.abstract: Recent years have seen a massive increase in the availability of spatial and spatio-temporal datasets. With these data comes a set of practical challenges, especially when researchers use spatial statistical models to generate predictions or synthesize datasets with differing spatial resolutions. At the basis of these models lies the notion of spatial scale, which, for a stationary and isotropic covariance, is quantified through a range parameter that captures the distance at which observations are considered independent in space. In this dissertation, we propose a set of statistical methods to investigate issues related to spatial scale, with the goal of providing a better characterization of the dependence structure of a spatial process. These methods are used to generate improved predictions and to produce estimates at the needed spatial resolution. Furthermore, several of the proposed methods account for the sampling mechanism of the data, whether they are derived through surveys or from non-probabilistic samples such as electronic health records (EHRs). In Chapter 2, building upon the Multi-resolution Approximation (M-RA) for large spatial data (Katzfuss, 2017), and leveraging the relationship between levels of the M-RA and the scale of a spatial process, we develop a Bayesian hierarchical model that explores and accommodates non-stationarity in spatial processes. In contrast to existing tests for global non-stationarity, our model can detect regions of local stationarity by specifying a mixture of multivariate normal priors on the basis function weights of the M-RA. Furthermore, our model outperforms other standard spatial statistical models in terms of out-of-sample prediction. In Chapter 3, we present a model for disaggregating estimates of proportions derived from the American Community Survey (ACS) to a fine spatio-temporal resolution. We envision that disaggregated estimates will be better proxies of neighborhood exposure than the ACS estimates, which are resolved at either a fine spatial resolution and coarse temporal scale, or at a coarse spatial resolution and fine temporal scale. By characterizing the data as an aggregation of an underlying point-referenced process, we disaggregate the ACS estimates to the 1-year census tract resolution. Crucial to our methodological development is the incorporation of the survey's design effect. A secondary development is a spatio-temporal version of the M-RA. In Chapter 4, we extend the disaggregation model of the previous chapter to accommodate estimates of count-valued characteristics. This chapter contains a comparison to the model of Bradley et al. (2016) (the BWH model), which addresses a similar problem for purely spatial data. In addition to accommodating spatio-temporal data, our model differs from the BWH model by incorporating the survey design effect into the model specification. We find that our model outperforms the BWH model in terms of prediction accuracy and coverage probability. In Chapter 5, we address the issue of sampling bias in EHR data, which can arise in studies of the association between disease and exposure when both the outcome variable and the exposure process are related to the sampling mechanism. Our method jointly models EHR and publicly available data to approximate sampling probabilities, which are then used to derive sampling weights. We show via simulation studies that we can recover the data-generating sampling probabilities and reduce bias compared to a naive analysis. To illustrate the utility of our model with clinical data, we present an analysis of smoking and lung cancer using subjects in the Michigan Genomics Initiative.
dc.language.iso: en_US
dc.subject: Spatial statistics
dc.subject: Bayesian hierarchical models
dc.subject: Spatial and spatio-temporal data
dc.subject: Non-stationary spatial processes
dc.subject: Survey-based estimates
dc.subject: Sampling bias in Electronic Health Records
dc.title: On Issues of Scale and Dependence in Spatial and Spatio-Temporal Data
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Biostatistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Berrocal, Veronica J
dc.contributor.committeemember: O'Neill, Marie Sylvia
dc.contributor.committeemember: Little, Roderick J
dc.contributor.committeemember: Mukherjee, Bhramar
dc.subject.hlbsecondlevel: Public Health
dc.subject.hlbsecondlevel: Statistics and Numeric Data
dc.subject.hlbtoplevel: Health Sciences
dc.subject.hlbtoplevel: Science
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/153375/1/benedetm_2.pdf
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/153375/2/benedetm_1.pdf
dc.identifier.orcid: 0000-0002-6334-1975
dc.identifier.name-orcid: Benedetti, Marco; 0000-0002-6334-1975 (en_US)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

