Data Integration Methods for Time-to-Event Outcomes and Measuring the Risk of Surrogate Paradox in Sub-Populations
Shafie Khorassani, Fatema
2023
Abstract
This dissertation proposes methods for leveraging existing data sources to answer new public health research questions in two areas. In the first project, we develop methods for validating surrogate outcomes when data are available from several prior trials of the same surrogate and true outcome combination. In the second and third projects, we develop methods for data fusion with time-to-event outcomes, motivated by studying the factors associated with mortality from head and neck cancer.
Clinical trials often collect intermediate or surrogate endpoints other than the true endpoint of interest. There are settings in which the proposed surrogate endpoint is positively correlated with the true endpoint, but the treatment has opposite effects on the surrogate and true endpoints, a phenomenon labeled “surrogate paradox”. Covariate information may be useful in predicting an individual’s risk of surrogate paradox. In the first project, we consider the issue of validating surrogate outcomes. We propose methods for incorporating covariate information into measures for assessing the risk of surrogate paradox using the meta-analytic causal association framework. The measures calculate the probability that a treatment will have opposite effects on the surrogate and true endpoints. They also determine, as a function of covariates, the size of a true positive treatment effect on the surrogate endpoint that would reduce the risk of a negative treatment effect on the true endpoint, allowing the effects of the covariates on the surrogate and true endpoints to vary across trials.
In the second project, we develop methods for data fusion with a time-to-event outcome, with the goal of combining information from two separate data sources, one of which includes the outcome of interest and the other of which includes a set of important confounders.
Some existing missing data methods have been extended to this data fusion setting, but they do not allow for censored time-to-event outcomes. To develop data fusion methods for time-to-event outcomes, we use the equivalence between two likelihoods: that of a proportional hazards model with piecewise constant baseline hazards on pre-specified intervals of follow-up, and a Poisson log-linear likelihood using transformed data with pseudo-observations for each combination of individual and interval.
This project is motivated by studying the factors associated with racial disparities in cancer mortality. Many factors may confound the association between race and cancer-specific mortality, including healthcare access, socioeconomic status, and comorbidities. Existing national cancer surveillance databases each collect parts of this information. When estimating disparities in cancer mortality, using the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) registry means excluding information on important confounders like hospital type, insurance status, and comorbidities. On the other hand, the National Cancer Database (NCDB), which does provide information on those variables, excludes cause-of-death information, making it impossible to estimate cancer-specific mortality. Integrating data from both sources allows us to study associations between race and cancer-specific mortality adjusted for important confounders.
In the third project, we extend the methods from the second project to propose data fusion methods for multiple data sources, when no single dataset contains both the outcome and all the covariates of interest. We provide estimating equations for data fusion with normal or time-to-event outcomes and compare two estimation procedures. We apply the methods to study mortality from head and neck cancer using data from a University of Michigan cohort study combined with data from SEER and NCDB.
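The likelihood equivalence mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the cut points, and the scalar `linpred` standing in for a covariate effect are all assumptions made for illustration. The sketch splits one subject's follow-up into one pseudo-observation per interval at risk, then evaluates the piecewise-constant-hazard log-likelihood and the Poisson log-likelihood with a log-exposure offset on those rows.

```python
import math


def expand_to_person_intervals(time, event, cuts):
    """Split one subject's follow-up into pseudo-observations.

    cuts: increasing interval start points beginning at 0; the last
    interval is open-ended. Returns (interval_index, exposure, died) rows.
    All names here are hypothetical, chosen for illustration only.
    """
    rows = []
    for j, start in enumerate(cuts):
        end = cuts[j + 1] if j + 1 < len(cuts) else math.inf
        if time <= start:
            break  # subject is no longer at risk in this interval
        exposure = min(time, end) - start
        died = int(event == 1 and time <= end)
        rows.append((j, exposure, died))
    return rows


def ph_loglik(rows, log_hazards, linpred):
    """Log-likelihood under piecewise constant baseline hazards."""
    total = 0.0
    for j, exposure, died in rows:
        log_rate = log_hazards[j] + linpred
        total += died * log_rate - exposure * math.exp(log_rate)
    return total


def poisson_loglik(rows, log_hazards, linpred):
    """Poisson log-likelihood with log(exposure) as an offset.

    The log(died!) term is zero because died is 0 or 1.
    """
    total = 0.0
    for j, exposure, died in rows:
        log_mu = math.log(exposure) + log_hazards[j] + linpred
        total += died * log_mu - math.exp(log_mu)
    return total


# One subject who dies at time 2.5, with intervals [0, 1), [1, 3), [3, inf).
rows = expand_to_person_intervals(2.5, 1, [0.0, 1.0, 3.0])
diff_a = poisson_loglik(rows, [-1.0, -0.5, 0.0], 0.3) - ph_loglik(rows, [-1.0, -0.5, 0.0], 0.3)
diff_b = poisson_loglik(rows, [0.2, -2.0, 1.0], -0.7) - ph_loglik(rows, [0.2, -2.0, 1.0], -0.7)
```

The two log-likelihoods differ only by the sum of `died * log(exposure)`, which does not involve the hazard or regression parameters, so `diff_a` and `diff_b` agree up to floating-point error and both models are maximized at the same parameter values.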
Subjects
Biostatistics; Data Fusion; Time-to-Event Outcomes; Cancer Mortality
Types
Thesis