On Issues of Scale and Dependence in Spatial and Spatio-Temporal Data
Benedetti, Marco
2019
Abstract
Recent years have seen a massive increase in the availability of spatial and spatio-temporal datasets. With these data comes a set of practical challenges, especially when researchers use spatial statistical models to generate predictions or synthesize datasets with differing spatial resolutions. At the basis of these models lies the notion of spatial scale which, for a stationary and isotropic covariance, is quantified through a range parameter that captures the distance beyond which observations are considered independent in space. In this dissertation, we propose a set of statistical methods to investigate issues related to spatial scale, with the goal of providing a better characterization of the dependence structure of a spatial process. These methods are used to improve predictions and to produce estimates at the needed spatial resolution. Furthermore, several of the proposed methods account for the sampling mechanism of the data, whether they are derived through surveys or from non-probabilistic samples such as electronic health records (EHRs). In Chapter 2, building upon the Multi-resolution Approximation (M-RA) for large spatial data (Katzfuss, 2017), and leveraging the relationship between levels of the M-RA and the scale of a spatial process, we develop a Bayesian hierarchical model that explores and accommodates non-stationarity in spatial processes. In contrast to existing tests for global non-stationarity, our model can detect regions of local stationarity by specifying a mixture of multivariate normal priors on the basis function weights of the M-RA. Furthermore, our model outperforms other standard spatial statistical models in terms of out-of-sample prediction. In Chapter 3, we present a model for disaggregating estimates of proportions derived from the American Community Survey (ACS) to a fine spatio-temporal resolution.
We envision that disaggregated estimates will be better proxies of neighborhood exposure than the ACS estimates, which are resolved at either a fine spatial resolution and coarse temporal scale, or at a coarse spatial resolution and fine temporal scale. By characterizing the data as an aggregation of an underlying point-referenced process, we disaggregate the ACS estimates to the 1-year census tract resolution. Crucial to our methodological development is the incorporation of the survey’s design effect. A secondary development is a spatio-temporal version of the M-RA. In Chapter 4, we extend the disaggregation model of the previous chapter to accommodate estimates of count-valued characteristics. This chapter contains a comparison to the model of Bradley et al. (2016) (the BWH model), which addresses a similar problem for purely spatial data. In addition to accommodating spatio-temporal data, our model differs from the BWH model by incorporating the survey design effect into the model specification. We find that our model outperforms the BWH model in terms of prediction accuracy and coverage probability. In Chapter 5, we address the issue of sampling bias in EHR data, which can arise in studies of the association between disease and exposure when both the outcome variable and the exposure process are related to the sampling mechanism. Our method jointly models EHR and publicly available data to approximate sampling probabilities, which are then used to derive sampling weights. We show via simulation studies that we can recover data-generating sampling probabilities and reduce bias compared to a naive analysis. To illustrate the utility of our model with clinical data, we present an analysis of smoking and lung cancer using subjects in the Michigan Genomics Initiative.
Subjects
Spatial statistics; Bayesian hierarchical models; Spatial and spatio-temporal data; Non-stationary spatial processes; Survey-based estimates; Sampling bias in Electronic Health Records
Types
Thesis