Machine Learning Approaches for Quantitative Analysis and Characterization of Pathological Speech Disorders

Perez, Matthew

2024

Abstract

Automatic quantitative analyses for characterizing pathological speech have the potential to aid medical professionals in assessing disease severity. Pathological speech disorders can manifest in a variety of ways, which creates an opportunity for speech-based biomarkers to capture these differences. Ultimately, developing these quantitative measurements for tracking disease progression is crucial for assisting clinicians with evaluation and treatment planning. Quantitative speech analysis serves multiple functions, including providing tools for better transcription, features that capture specific acoustic patterns, and machine learning models that detect symptoms, characteristics, or events. In this thesis, I focus on improving quantitative speech analyses for two pathological speech disorders, Huntington's Disease (HD) and aphasia. HD is a neurodegenerative disease that causes motor dysfunction and manifests itself in speech production difficulties, while aphasia is a language disorder that manifests in a wide variety of speech and language impairments. Despite these differences, the framework for developing effective quantitative speech analyses is similar and can be extended to help evaluate broader classes of pathological speech disorders. This quantitative speech analysis framework involves three stages: characterize, recognize, and analyze. In this thesis, I introduce different machine learning methods within each of these stages to enable more effective analyses. First, I focus on techniques for analyzing HD and show that text-based features automatically extracted from transcripts generated by automatic speech recognition (ASR) can be used to classify the presence of HD. Additionally, I show that these features are correlated with disease severity, making them candidates for severity tracking. I then investigate low-level speech features taken from the frequency domain. These features can be automatically extracted and are meant to measure vocal tract coordination. I demonstrate that these features scale in performance with audio length and outperform the previous transcript-based features. Together, these studies investigate transcript and acoustic biomarkers for automatically analyzing HD speech, which can be useful for downstream clinical applications. Next, I explore acoustic model improvements for automatically transcribing aphasic speech. ASR represents an important area of research for quantitative speech analysis, as many downstream features rely on accurate speech transcriptions. Challenges with ASR for disordered speech include data scarcity and high speaker variability. I first focus on improving ASR for aphasic speech using a mixture-of-experts acoustic model that models the variability in speech intelligibility across speakers. I show that using sub-networks trained on the basis of severity improves acoustic modeling performance. Lastly, I explore methods for improving the characterization of aphasic speech through automatic speech error detection. Specifically, this involves identifying different types of errors, known as paraphasias, in the speech of individuals with aphasia. I explore the effectiveness of end-to-end machine learning models for automatic paraphasia detection and show significant performance improvements over existing multi-step machine learning approaches. In conclusion, the goal of my thesis is to explore various methods, features, and model architectures related to the different aspects of quantitative speech analysis, in order to enable more effective clinical assessment of pathological speech.

Deep Blue DOI

https://dx.doi.org/10.7302/23980

Subjects

Disordered Speech Analysis

Acoustic Feature Extraction

Disordered Speech Recognition

Paraphasia Detection

Types

Thesis

Handle

https://hdl.handle.net/2027.42/194632

Collections

Dissertations and Theses (Ph.D. and Master's)
