Bayesian Models for Multi-omic Multi-system Integration for Precision Oncology
Bhattacharyya, Rupam
2023
Abstract
The molecular heterogeneity of cancer makes it challenging to delineate the underlying mechanisms and optimize therapeutic avenues. Large-scale cancer datasets across multiple dimensions (such as omics and clinical data types, or cancer systems including patients and cell lines) offer assistance towards mitigating these challenges via a granular yet holistic view of the disease. While integrative approaches have the potential to both unmask novel functional mechanisms and prioritize therapeutic targets, their development and implementation are challenging due to data variety and the underlying dependence within/between such datasets. In this dissertation, I focus on developing Bayesian statistical procedures that can take advantage of the diversity offered by such databases while taking into account the associated biological and statistical challenges. In Chapter II, I develop TransPRECISE, a multiscale Bayesian network modeling framework, to analyze the pan-cancer patient and cell line interactome. I assess pan-cancer pathway activities of patients from 31 tumor types and cell lines from 16 lineages, along with the cell lines’ response to 481 drugs. TransPRECISE captures differential and conserved proteomic pathway circuitries between multiple patient and cell line lineages. Tumor stratification using these learned networks uncovers distinct clinical subtypes of patient cancers characterized by different cell line avatars. High predictive accuracy is observed for cell line drug sensitivities using Bayesian additive regression tree models with TransPRECISE pathway scores as predictors. In Chapter III, I propose fiBAG, an integrative hierarchical Bayesian framework for modeling the fundamental biological relationships underlying cross-platform molecular features of cancer. Using Gaussian process models, fiBAG identifies upstream functional evidence for proteogenomic biomarkers.
By mapping said evidence to prior inclusion probabilities, a calibrated Bayesian variable selection (cBVS) model is built to identify biomarkers associated with an outcome of interest. Simulation studies show that cBVS has higher power to detect disease-related markers than non-integrative approaches. Via an integrative proteogenomic analysis of 14 cancer datasets, several known and novel genes/proteins associated with cancer stemness and patient survival are identified. While multi-omic patient databases have sparse drug response data, cancer model systems databases provide extensive pharmacogenomic profiles, albeit with lower sample sizes, resulting in reduced statistical power. For this reason, in Chapter IV, I propose BaySyn, a hierarchical Bayesian evidence synthesis framework that detects functionally relevant driver genes based on their associations with upstream regulators and uses this evidence to calibrate Bayesian variable selection models in the (drug) outcome layer. I use BaySyn to analyze multi-omic patient and cell line datasets across pan-gynecological cancers. BaySyn mechanistic models implicate several known functional genes in GO and KEGG gene sets of interest in the cancers assessed. Further, the BaySyn outcome model makes more discoveries than its uncalibrated counterparts under equal Type I error control. In Chapter V, I focus on incorporating tumor heterogeneity in clinicogenomic models. To this end, I propose GPVIBES, a Gaussian process-based varying coefficient model using Bayesian variable selection, to model the association between a biomarker and an outcome as a function of a hierarchical covariate equipped with horseshoe prior-based shrinkage. Simulation studies with one or more hierarchical covariates show that at the same signal-to-noise and sample-size-to-dimensionality ratios, GPVIBES yields improved selection performance alongside accurate estimates of the coefficient function, compared to other varying-coefficient-based models.
A pan-cancer integrative analysis of 16 cancers identified modulation of proteomic associations via several known signatures.
Subjects
Bayesian Machine Learning; Integrative Statistical Models; Precision Oncology; Pharmacogenomics; Clinicogenomic Models; Bayesian Variable Selection
Types
Thesis