Statistical Methods for Complex Data: Hospital Evaluation and Causal Inference

dc.contributor.author: Xu, Tongbo
dc.date.accessioned: 2025-05-12T17:39:13Z
dc.date.available: 2025-05-12T17:39:13Z
dc.date.issued: 2025
dc.date.submitted: 2025
dc.identifier.uri: https://hdl.handle.net/2027.42/197238
dc.description.abstract: Real-world data often have complex structures that go beyond what traditional cross-sectional study frameworks can accommodate, which introduces challenges for statistical analysis. This dissertation addresses these challenges by developing novel statistical methodologies that accommodate various data structures, including complex relationships between outcomes and covariates, clustered data (e.g., patients nested within hospitals), and data integration settings. Specifically, Chapter 2 introduces a novel random forest approach for hospital evaluation in clustered data. Chapters 3 and 4 focus on innovative causal inference frameworks and methods designed for clustered data and data integration scenarios. In Chapter 2, we introduce the Fixed Effect Clustered Random Forest (FCRF) to model the relationship between the outcome and the covariates in clustered data, with applications such as hospital evaluation in mind. This approach combines the hierarchical structure of fixed-effect clustered models with the flexibility of random forests through an iterative algorithm. To address potential overfitting and bias in random forests, we integrate bias correction techniques and introduce the Fixed Effect Clustered Random Forest with Bias Correction (FCRFBC). Simulation studies confirm the effectiveness of these methods. We further illustrate them by analyzing data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) to evaluate hospitals in managing patients' hemoglobin levels with standardized medications, demonstrating advantages over conventional approaches. In Chapter 3, we develop a framework for making causal inferences on treatment effects in clustered data, which are frequently encountered in large observational clinical studies where patients are nested within hospitals. While causal inference methods are well established for cross-sectional data, the causal estimands and assumptions for clustered data have been less explored. Within a new potential outcome framework designed for clustered data, we define a series of causal estimands and the assumptions required for valid inference. We then propose new cluster-level weighted propensity score weighting methods that consistently estimate these treatment effects, as demonstrated both theoretically and through simulations. We also apply these methods to the BMC2 dataset for empirical illustration. In Chapter 4, we focus on estimating causal effects for an internal study of interest when summary information from multiple external studies can potentially be used to improve the efficiency of estimation, a typical data integration scenario. We introduce Penalized Empirical Augmented Inverse Propensity Weighting (PEAIPW), a penalized empirical likelihood method that uses the group lasso to select and incorporate the external information that is useful for internal causal effect estimation, thereby improving its efficiency. Through both theoretical analysis and simulations, we investigate the properties and performance of this method, including its selection consistency for external information, its double-robustness property, and its potential for efficiency gains.
dc.language.iso: en_US
dc.subject: causal inference
dc.subject: random forest
dc.subject: data integration
dc.subject: clustered data
dc.subject: hospital evaluation
dc.title: Statistical Methods for Complex Data: Hospital Evaluation and Causal Inference
dc.type: Thesis
dc.description.thesisdegreename: PhD
dc.description.thesisdegreediscipline: Biostatistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Shi, Xu
dc.contributor.committeemember: Zhang, Min
dc.contributor.committeemember: Bakulski, Kelly Marie
dc.contributor.committeemember: He, Zhi
dc.subject.hlbsecondlevel: Statistics and Numeric Data
dc.subject.hlbtoplevel: Science
dc.contributor.affiliationumcampus: Ann Arbor
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/197238/1/tongboxu_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/25664
dc.identifier.orcid: 0009-0008-3205-2370
dc.identifier.name-orcid: Xu, Tongbo; 0009-0008-3205-2370 [en_US]
dc.working.doi: 10.7302/25664 [en]
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
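
Illustrative note on the abstract: the Chapter 4 method builds on augmented inverse propensity weighting (AIPW). The sketch below shows only the standard AIPW estimator of an average treatment effect, not PEAIPW itself; the penalized empirical likelihood step and the group lasso selection of external information are not reproduced. The function name `aipw_ate` and its arguments are hypothetical, and the propensity scores and outcome-model predictions are assumed to have been estimated beforehand.

```python
from statistics import fmean


def aipw_ate(y, a, e, m1, m0):
    """Standard AIPW estimate of the average treatment effect.

    y  : observed outcomes
    a  : treatment indicators (1 = treated, 0 = control)
    e  : estimated propensity scores P(A = 1 | X)
    m1 : outcome-model predictions under treatment
    m0 : outcome-model predictions under control
    (All names are illustrative, not taken from the dissertation.)
    """
    terms = []
    for yi, ai, ei, m1i, m0i in zip(y, a, e, m1, m0):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Outcome-model prediction plus an inverse-propensity-weighted residual
        treated = m1i + ai * (yi - m1i) / ei
        control = m0i + (1 - ai) * (yi - m0i) / (1 - ei)
        terms.append(treated - control)
    return fmean(terms)
```

When the outcome models fit the observed outcomes exactly, the weighted residuals vanish and the estimate reduces to the mean of `m1 - m0`; when the outcome predictions are all zero, it reduces to plain inverse propensity weighting. This structure is what underlies the double-robustness property mentioned in the abstract: the estimator is consistent if either the propensity model or the outcome model is correctly specified.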

