Bayesian Model Expansion for Selection Bias in Epidemiology

Trangucci, Robert Neale

2023

Abstract

Selection bias is a massive problem in infectious disease epidemiology that can result in needless morbidity and mortality. This bias is both subtle and ubiquitous, occurring even in randomized clinical trials. For example, medical researchers cannot randomize responses to treatment intermediate to the outcome of interest, and epidemiologists cannot force patients to report sensitive demographic information. To do inference in these complex scenarios, we need new classes of models that capture the scientific process of interest while accounting for how the data were observed. In this thesis I develop theory and practice for Bayesian model expansion to mitigate and adjust for selection bias in the analysis of observational and experimental data arising in the areas of missing data, causal inference, and survey research.

In the second chapter I propose a novel method to infer stratified incidence in disease surveillance data with partially observed stratum information. Public health researchers often compare risk of disease among demographic subgroups in order to design interventions. Missingness in demographic covariates like race/ethnicity or age group complicates this endeavor; dropping cases with missing covariates can lead to endogenous selection bias. Instead, I develop a locally identifiable joint model for the missingness process and the disease process that allows for the missingness process to be not-missing-at-random. The model is identified by marrying spatial information in the disease data with spatial Census data. I investigate the finite-sample properties of the model via a simulation study, and apply my model to COVID-19 case data in Southeastern Michigan. I show that the burden of COVID-19 from March to July of 2020 for non-Whites relative to that of Whites is understated when cases that are missing race/ethnicity information are omitted.

In the third chapter I develop a method to point-identify vaccine efficacy (VE) against post-infection outcomes such as severe illness and death. Policy makers need to quantify post-infection outcome VE so as to design effective vaccination strategies, but these causal estimands are typically nonidentifiable. I propose a method to identify these estimands under measurement error on infection and post-infection outcomes by taking advantage of the structure of vaccine efficacy trials; these trials are typically run across different health systems and collect pretreatment covariates related to an individual's susceptibility to infection. I show that my method not only yields identifiability of the causal estimand, but also identifies the infection measurement error parameters. I then investigate the Type I error and power of my method via a simulation study.

In the final chapter, I propose a new Bayesian generative semiparametric model for characterizing the cumulative spatial exposure to an environmental health hazard that is not well represented by a single point in space, like a system of wastewater canals. The model couples a dose-response model with a log-Gaussian Cox process integrated against a distance kernel with an unknown length-scale. I show that this model is well defined, and that a simple integral approximation adequately controls the computational error. Before applying the model to survey data from Mexico, I quantify the finite-sample properties and the computational tractability of the discretization scheme in a simulation study.

Deep Blue DOI

https://dx.doi.org/10.7302/8573

Subjects

Selection bias

Bayesian models

Missing data

Causal inference

Types

Thesis

Handle

https://hdl.handle.net/2027.42/178116

Collections

Dissertations and Theses (Ph.D. and Master's)
