Aligning Machine Learning Solutions with Clinical Needs
Kamran, Fahad
2023
Abstract
The availability of large observational datasets in healthcare presents an opportunity to leverage machine learning techniques to learn complex relationships between an individual’s characteristics, underlying health status, and response to interventions. Despite progress, there is often a mismatch between how machine learning models are developed and clinical needs. In this dissertation, we study how considering clinical needs can and should inform model development in healthcare. First, in survival analysis, deep learning approaches have been proposed for estimating an individual's survival probability over some time horizon. However, these methods often focus on optimizing discriminative performance and have ignored model calibration. Well-calibrated survival curves present realistic and meaningful probabilistic estimates of the true underlying survival process for an individual, an essential characteristic for survival analysis models in many clinical contexts. In light of the shortcomings of existing approaches, we propose a new training scheme for optimizing deep survival analysis models for strong discriminative performance and good calibration. Across two clinical datasets, we show that our approach yields models with strong discriminative performance while improving calibration over existing methods. Second, in causal inference, past work has focused on accurately estimating conditional average treatment effects (CATEs) to help guide treatment allocation. However, in many settings, decision-makers only require a ranking of individuals to assist in allocating treatments. Leveraging the insight that ranking can be simpler than CATE estimation and that better CATE accuracy does not necessarily translate to better treatment allocation, we propose an approach that optimizes directly for rankings of individuals to maximize the benefit of treatment. Our tree-based approach maximizes the expected benefit across all treatment thresholds using a novel splitting criterion. Through experiments on synthetic datasets, we show that the proposed approach leads to better sample efficiency and better treatment assignments, as measured by expected benefit, compared to models optimized for accurate CATEs. Third, when exact CATEs are needed, we study the mismatch between theoretical results in CATE estimation and how well this theory holds empirically. In recent years, techniques incorporating estimates of both the propensity score and potential outcomes have gained popularity, in part due to their strong theoretical guarantees for overcoming confounding bias. However, how this theory translates to practice across an extensive set of practical settings, especially in the context of deep learning, has not been well explored. We present an in-depth exploration of popular techniques, finding that those relying only on estimates of the outcome, in particular the X-Learner, can consistently outperform more sophisticated techniques across a variety of practical settings. Finally, we study how the mismatch between machine learning objectives and clinical needs manifests in existing clinical tools for sepsis risk stratification. Standard risk-stratification approaches focus on predicting the likelihood of sepsis before the sepsis criteria are met. However, neither the training nor the evaluation of these models matches the ultimate goal of augmenting clinical decision-making to improve patient outcomes. We study both challenges, finding that: 1) existing risk stratification approaches deteriorate significantly when evaluated before clinical recognition of sepsis and 2) targeting those most likely to develop sepsis may be sub-optimal with respect to improving patient outcomes. Overall, our contributions bridge, in part, the gap between machine learning research and practice in healthcare. Ultimately, by recognizing domain-specific needs in clinical care as we have, machine learning practitioners can develop more impactful models.
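As a rough illustration of the outcome-based CATE estimation and ranking mentioned in the abstract, the following is a minimal sketch of an X-Learner on synthetic data. It is not the dissertation's implementation, which concerns deep learning models. The helper names (fit_linear, x_learner_cate), the linear base learners, the constant combination weight g = 0.5, and the data-generating process are all assumptions made for this example.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a predict function.
    (Hypothetical helper: a linear model stands in for any base learner.)"""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef

def x_learner_cate(X, t, y, g=0.5):
    """Sketch of an X-Learner; g is an assumed constant combination weight."""
    X0, y0 = X[t == 0], y[t == 0]
    X1, y1 = X[t == 1], y[t == 1]
    # Stage 1: one outcome model per treatment arm.
    mu0 = fit_linear(X0, y0)
    mu1 = fit_linear(X1, y1)
    # Stage 2: imputed individual effects, then one effect model per arm.
    d1 = y1 - mu0(X1)   # treated: observed outcome minus predicted control outcome
    d0 = mu1(X0) - y0   # control: predicted treated outcome minus observed outcome
    tau1 = fit_linear(X1, d1)
    tau0 = fit_linear(X0, d0)
    # Stage 3: weighted combination of the two effect models.
    return lambda Xn: g * tau0(Xn) + (1 - g) * tau1(Xn)

# Synthetic data with randomized treatment (an assumption for this example).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
t = rng.integers(0, 2, size=n)
true_cate = 1.0 + 2.0 * X[:, 0]
y = X[:, 1] + t * true_cate + rng.normal(scale=0.1, size=n)

tau_hat = x_learner_cate(X, t, y)(X)

# Rank individuals by estimated benefit, highest first, for treatment allocation.
ranking = np.argsort(-tau_hat)
```

When only a limited number of treatments can be allocated, the ordering in ranking is what a decision-maker would use; the exact values of tau_hat matter only when precise effect sizes are needed.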
Subjects
machine learning, causal effect estimation, survival analysis, risk stratification, resource allocation
Types
Thesis