Flexible Methods for the Analysis of Clustered Event Data in Observational Studies

dc.contributor.author: Wang, Lili
dc.date.accessioned: 2020-10-04T23:27:01Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2020-10-04T23:27:01Z
dc.date.issued: 2020
dc.date.submitted: 2020
dc.identifier.uri: https://hdl.handle.net/2027.42/163013
dc.description.abstract: Clustered event data are frequently encountered in observational studies. In this dissertation, I focus on correlated event outcomes clustered by subject (multivariate events), by facility, or by both hierarchically. The main approaches to analyzing correlated event data are frailty models with random effects and marginal models with robust variance estimation. Difficulties with existing methods include a) heavy computational demands and slow speed in the presence of numerous clusters (e.g., recurrent events); b) a lack of rigorous diagnostic tools for prespecifying the distribution of the random effects; and c) the analysis of multi-state models that follow a semi-Markov renewal process. The growing need for flexible, computationally fast, and accurate estimation approaches for clustered event data motivates my methodological exploration in the following chapters. In Chapter II, I propose a log-normal correlated frailty model to analyze recurrent event incidence rates and duration jointly. The regression parameters are estimated through a penalized partial likelihood, and the variance-covariance matrix of the frailty is estimated via a recursive estimating formula. The proposed methods are more flexible and faster than existing approaches and have the potential to be extended to other frequently encountered data structures (e.g., joint modeling with longitudinal outcomes). In Chapter III, I propose a class of semiparametric frailty models that leave the distribution of frailties unspecified. Parameter estimation proceeds through estimating equations derived from first- and second-moment conditions. Estimation techniques have been developed for three models: a shared frailty model for a single event; a correlated frailty model for multiple events; and a hierarchically structured nested failure time model.
Extensive simulation studies demonstrate that the proposed approach can accurately estimate the regression parameters, baseline event rates, and variance components. Moreover, computation is fast, permitting application to very large data sets. In Chapter IV, I develop a class of multi-state rate models to study the association of exposure to lead, a major endocrine-disrupting agent, with behavioral changes captured by accelerometer measurements from the ActiGraph GT3X wearable device. Activity states are defined by categorizing personal activity counts over time with validated cutoffs, and are analyzed through their in-state transitions using the proposed multi-state rate models, in which the baseline rates are estimated nonparametrically. The proposed models combine the advantages of regular event rate models with the concept of competing risks, making it possible to incorporate a daily renewal property and to share baselines in the activity transition rates across different days. The regression parameters are specified in the event rate functions, leading to a semiparametric modeling framework. Statistical inference is based on a robust sandwich variance estimator that accounts for correlations between different event types and their recurrences. I found that the evaluated exposure to lead is associated with an increased transition from low activity to vigorous activity. Chapter V is a special project on modeling COVID-19 surveillance data in China, in which I develop two extended susceptible-infected-recovered (SIR) models under a Bayesian state-space model framework. I propose to include a time-varying transmission rate or a time-dependent quarantine process in the classical SIR model to assess the effectiveness of macro-control measures issued by the government to mitigate the pandemic. The proposed compartment models enable prediction of both the short-term and long-term prevalence of COVID-19 infection, with quantification of prediction uncertainty.
I provide and maintain an open-source R package on GitHub (lilywang1988/eSIR) for the developed analytics.
dc.language.iso: en_US
dc.subject: Multivariate Failure Time
dc.subject: Clustering
dc.subject: Joint Modeling
dc.subject: Frailty Model
dc.subject: Recurrent Events
dc.title: Flexible Methods for the Analysis of Clustered Event Data in Observational Studies
dc.type: Thesis
dc.description.thesisdegreename: PhD [en_US]
dc.description.thesisdegreediscipline: Biostatistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Schaubel, Douglas E
dc.contributor.committeemember: Song, Peter Xuekun
dc.contributor.committeemember: Hirth, Richard A
dc.contributor.committeemember: Banerjee, Mousumi
dc.contributor.committeemember: He, Zhi
dc.subject.hlbsecondlevel: Statistics and Numeric Data
dc.subject.hlbtoplevel: Science
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/163013/1/lilywang_1.pdf [en_US]
dc.identifier.orcid: 0000-0003-4276-3930
dc.identifier.name-orcid: Wang, Lili; 0000-0003-4276-3930 [en_US]
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
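
Illustrative note: the abstract's Chapter V idea of adding a time-varying transmission rate to the classical SIR model can be sketched as below. This is only a deterministic toy version under stated assumptions. It is not the dissertation's Bayesian state-space model and not the API of the eSIR package. The function names (`simulate_sir`, `step_modifier`), the parameter values, and the step-shaped modifier are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Tuple


def step_modifier(change_day: float, reduced: float) -> Callable[[float], float]:
    """Hypothetical transmission modifier: 1.0 before change_day, `reduced` after."""
    def pi(t: float) -> float:
        return 1.0 if t < change_day else reduced
    return pi


def simulate_sir(
    beta: float,
    gamma: float,
    pi: Callable[[float], float],
    s0: float,
    i0: float,
    days: int,
    steps_per_day: int = 10,
) -> List[Tuple[float, float, float]]:
    """Euler integration of an SIR model whose transmission rate is beta * pi(t).

    Compartments are population proportions, so s + i + r stays equal to 1.
    Returns one (s, i, r) tuple per day, including day 0.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    dt = 1.0 / steps_per_day
    path = [(s, i, r)]
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            new_infections = beta * pi(t) * s * i * dt  # S -> I flow
            new_recoveries = gamma * i * dt             # I -> R flow
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        path.append((s, i, r))
    return path


# Assumed illustrative values: beta = 0.3, gamma = 0.1, control starts on day 20.
no_control = simulate_sir(0.3, 0.1, lambda t: 1.0, 0.999, 0.001, 100)
with_control = simulate_sir(0.3, 0.1, step_modifier(20, 0.3), 0.999, 0.001, 100)

peak_no_control = max(i for _, i, _ in no_control)
peak_with_control = max(i for _, i, _ in with_control)
print(f"peak infected share without control: {peak_no_control:.3f}")
print(f"peak infected share with control:    {peak_with_control:.3f}")
```

Comparing the two runs shows how a drop in the modifier lowers the infection peak, which is the kind of effect the abstract says the extended models are meant to assess. The dissertation's models additionally quantify prediction uncertainty, which this sketch does not attempt.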

