Show simple item record

Aligning Machine Learning Solutions with Clinical Needs

dc.contributor.author: Kamran, Fahad
dc.date.accessioned: 2024-02-13T21:17:30Z
dc.date.available: 2024-02-13T21:17:30Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.identifier.uri: https://hdl.handle.net/2027.42/192371
dc.description.abstract: The availability of large observational datasets in healthcare presents an opportunity to leverage machine learning techniques to learn complex relationships between an individual’s characteristics, underlying health status, and response to interventions. Despite progress, there is often a mismatch between how machine learning models are developed and clinical needs. In this dissertation, we study how considering clinical needs can and should inform model development in healthcare. First, in survival analysis, deep learning approaches have been proposed for estimating an individual's survival probability over some time horizon. However, these methods often focus on optimizing discriminative performance and ignore model calibration. Well-calibrated survival curves present realistic and meaningful probabilistic estimates of the true underlying survival process for an individual, an essential characteristic for survival analysis models in many clinical contexts. In light of the shortcomings of existing approaches, we propose a new training scheme for optimizing deep survival analysis models for strong discriminative performance and good calibration. Across two clinical datasets, we show that our approach yields models with strong discriminative performance while improving calibration over existing methods. Second, in causal inference, past work has focused on accurately estimating conditional average treatment effects (CATEs) to help guide treatment allocation. However, in many settings, decision-makers only require a ranking of individuals to assist in allocating treatments. Leveraging the insight that ranking can be simpler than CATE estimation and that better CATE accuracy does not necessarily translate to better treatment allocation, we propose an approach that optimizes directly for rankings of individuals to maximize the benefit of treatment. Our tree-based approach maximizes the expected benefit across all treatment thresholds using a novel splitting criterion. Through experiments on synthetic datasets, we show that the proposed approach leads to better sample efficiency and better treatment assignments, as measured by expected benefit, compared to models optimized for accurate CATEs. Third, when exact CATEs are needed, we study the mismatch between theoretical results in CATE estimation and how this theory holds empirically. In recent years, techniques incorporating estimates of both the propensity score and potential outcomes have gained popularity, in part due to their strong theoretical guarantees for overcoming confounding bias. However, how this theory translates to practice across an extensive set of practical settings, especially in the context of deep learning, has not been well explored. We present an in-depth exploration of popular techniques, finding that those relying only on estimates of the outcome, in particular the X-Learner, can consistently outperform more sophisticated techniques across a variety of practical settings. Finally, we study how the mismatch between machine learning objectives and clinical needs manifests in existing clinical tools for sepsis risk stratification. Standard risk-stratification approaches focus on predicting the likelihood of sepsis before the sepsis criteria are met. However, neither the training nor the evaluation of these models matches the ultimate goal of augmenting clinical decision-making to improve patient outcomes. We study both challenges, finding that: 1) existing risk-stratification approaches deteriorate significantly when evaluated before clinical recognition of sepsis and 2) targeting those most likely to develop sepsis may be sub-optimal with respect to improving patient outcomes. Overall, our contributions bridge, in part, the gap between machine learning research and practice in healthcare. Ultimately, by recognizing domain-specific needs in clinical care as we have, machine learning practitioners can develop more impactful models.
dc.language.iso: en_US
dc.subject: machine learning
dc.subject: causal effect estimation
dc.subject: survival analysis
dc.subject: risk stratification
dc.subject: resource allocation
dc.title: Aligning Machine Learning Solutions with Clinical Needs
dc.type: Thesis
dc.description.thesisdegreename: PhD
dc.description.thesisdegreediscipline: Computer Science & Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Wiens, Jenna
dc.contributor.committeemember: Ladhania, Rahul
dc.contributor.committeemember: Fouhey, David Ford
dc.contributor.committeemember: Makar, Maggie
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbtoplevel: Engineering
dc.contributor.affiliationumcampus: Ann Arbor
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/192371/1/fhdkmrn_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/22280
dc.identifier.orcid: 0000-0003-2488-8887
dc.identifier.name-orcid: Kamran, Fahad; 0000-0003-2488-8887 (en_US)
dc.working.doi: 10.7302/22280 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)


