Two Biostatistical Problems
Chase, Elizabeth
2023
Abstract
This dissertation examines two problems in biostatistics. The first and second projects develop horseshoe process regression (HPR), a Bayesian nonparametric model that uses statistical shrinkage to capture abruptly changing associations between a continuous predictor and an outcome. We use HPR to model women’s basal body temperature (BBT) across the menstrual cycle. In contrast, the third project proposes a nonparametric multiple imputation approach to estimating the cumulative incidence, a key descriptive statistic in survival analysis.
The first project starts from a truism: biomedical data often exhibit jumps or abrupt changes. These sudden changes make such data challenging to model, as many methods either oversmooth the sharp changes or overfit in response to measurement error. We develop HPR to address this problem. We define a horseshoe process as a stochastic process in which each increment is horseshoe-distributed. We use the horseshoe process as a nonparametric Bayesian prior for modeling the association between an outcome and a continuous predictor. We provide guidance and extensions to advance HPR’s use in applied practice: we introduce a Bayesian imputation scheme to allow for interpolation at unobserved values of the predictor within the HPR; include additional covariates via a partial linear model framework; and allow for monotonicity constraints. We find that HPR performs well when fitting functions that have sharp changes, and we use it to model women’s BBT over the course of the menstrual cycle.
In the second project, we focus on using HPR for one particular type of abruptly changing data: BBT over the course of the menstrual cycle. Women’s BBT exhibits abrupt changes at the time of ovulation and menstruation, which many methods struggle to capture. While the first project demonstrated that HPR had potential for modeling BBT, in the second project we tailor HPR for this setting. We re-implement HPR using variational inference to reduce computation time, and we show that this offers results comparable to those provided by Hamiltonian Monte Carlo in the first project. We incorporate ovulation pattern into the HPR model to provide posterior estimates of ovulation day and its uncertainty. We consider a posterior-prior passing scheme in order to share information across cycles. We use this BBT-specific version of HPR (HPR-BBT) to analyze BBT data from a large cohort of British women. Overall, HPR-BBT offers sensible estimates of ovulation day and BBT trajectory.
And now for something completely different: the third project. We propose an alternative approach to the Aalen-Johansen estimator of the cumulative incidence. Rather than calculating the cumulative incidence directly, we instead perform nonparametric multiple imputation to generate event times and types for censored individuals. Thus, on each imputation, all participants are “observed” to have an event. Calculating the cumulative incidence on each imputation is then merely estimating a proportion at each timepoint, which yields point and uncertainty estimates that can be aggregated across imputations via Rubin’s Rules. The resulting multiple imputation estimator is shown mathematically and empirically to generate point estimates equivalent to those of the Aalen-Johansen estimator as the number of imputations increases; in addition, the multiple imputation estimator offers improved options for uncertainty estimation. We discuss connections to redistribute-to-the-right algorithms and other imputation approaches for survival analysis.
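The aggregation step of the third project can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes the imputed datasets are already available as two hypothetical arrays of shape (M, n) holding an event time and an event type for every participant, and it assumes a simple binomial within-imputation variance for each proportion. The function name and its arguments are illustrative only, and the nonparametric imputation step itself is not shown.

```python
import numpy as np


def pooled_cumulative_incidence(event_times, event_types, cause, t):
    """Pool a cumulative incidence estimate at time t over M imputations.

    event_times, event_types: hypothetical arrays of shape (M, n) in which
    every participant has an observed or imputed event time and type.
    """
    event_times = np.asarray(event_times, dtype=float)
    event_types = np.asarray(event_types)
    M, n = event_times.shape

    # On each imputation, the cumulative incidence for `cause` at time t
    # is the proportion of participants with that event type by time t.
    p = np.mean((event_times <= t) & (event_types == cause), axis=1)

    # Within-imputation variance: binomial variance of a proportion
    # (an assumption of this sketch), averaged over imputations.
    within = np.mean(p * (1.0 - p) / n)

    # Between-imputation variance of the M point estimates.
    between = np.var(p, ddof=1) if M > 1 else 0.0

    # Rubin's Rules: average the estimates and combine the variances.
    estimate = p.mean()
    total_var = within + (1.0 + 1.0 / M) * between
    return estimate, total_var


# Toy example: M = 2 imputations of n = 4 participants (made-up values).
times = np.array([[1.0, 2.0, 3.0, 4.0],
                  [1.0, 2.5, 3.0, 5.0]])
types = np.array([[1, 2, 1, 1],
                  [1, 1, 1, 2]])
est, var = pooled_cumulative_incidence(times, types, cause=1, t=3.0)
```

Here the pooled point estimate is the average of the per-imputation proportions, and the total variance adds the between-imputation spread, inflated by (1 + 1/M), to the average within-imputation variance.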
Subjects
Statistical shrinkage; Bayesian statistics; Stochastic processes; Competing risks; Cumulative incidence; Nonparametric multiple imputation
Types
Thesis