
The Modulatory Effects of Visual Speech on Auditory Speech Perception: A Multi-Modal Investigation of How Vision Alters the Temporal, Spatial and Spectral Components of Speech

dc.contributor.author: G, Karthik
dc.date.accessioned: 2022-05-25T15:21:34Z
dc.date.available: 2022-05-25T15:21:34Z
dc.date.issued: 2022
dc.date.submitted: 2022
dc.identifier.uri: https://hdl.handle.net/2027.42/172599
dc.description.abstract: Visual speech information, especially that provided by the mouth and lips, is important during face-to-face communication. This has become more evident with the increased difficulty of speech perception now that mask usage has become commonplace in response to the COVID-19 pandemic. Masks obscure the mouth and lips, eliminating meaningful visual cues that listeners use to perceive speech accurately. To fully understand the perceptual benefits afforded by visual information during audiovisual speech perception, it is necessary to explore the underlying neural mechanisms involved. While several studies have shown neural activation of auditory regions in response to visual speech, the information represented by these activations remains poorly understood. The objective of this dissertation is to investigate the neural bases for how visual speech modulates the temporal, spatial, and spectral components of audiovisual speech perception, and the type of information encoded by these signals. Most studies approach this question using techniques sensitive to only one or two of these dimensions (temporal, spatial, or spectral). Even in studies that have used intracranial electroencephalography (iEEG), which is sensitive to all three dimensions, research conventionally quantifies effects using single-subject statistics, leaving group-level variance unexplained. In Study 1, I overcome these shortcomings by investigating how vision modulates auditory speech processes across the spatial, temporal, and spectral dimensions in a large group of epilepsy patients with implanted intracranial electrodes (n = 21). The results of this study demonstrate that visual speech produced multiple spatiotemporally distinct patterns of theta, beta, and high-gamma power changes in auditory regions in the superior temporal gyrus (STG). While Study 1 showed that visual speech evoked activity in auditory areas, it is not clear what information, if any, is encoded by these activations. In Study 2, I investigated whether these distinct patterns of activity in the STG, produced by visual speech, contain information about which word is being said. To address this question, I used a support-vector machine classifier to decode the identities of four word types (words beginning with ‘b’, ‘d’, ‘g’, and ‘f’) from STG activity recorded during spoken speech (phonemes: the basic units of speech) or silent visual speech (visemes: the basic units of lipreading information). Results from this study indicated that visual speech indeed encodes lipreading information in auditory regions. Studies 1 and 2 provided evidence from iEEG data obtained from patients with epilepsy. To replicate these results in a normative population and to leverage improved spatial resolution, in Study 3 I acquired data from a large cohort of normative subjects (n = 64) during a randomized event-related functional magnetic resonance imaging (fMRI) experiment. As in Study 2, I used machine learning to test for classification of phonemes and visemes (/fafa/, /kaka/, /mama/) from auditory, auditory-visual, and visual regions of the brain. The results conceptually replicated those of Study 2: both phoneme and viseme identities could be classified from the STG, revealing that this information is encoded in distributed representations. Further analyses revealed similar spatial patterns in the STG for phonemes and visemes, consistent with the model that viseme information is used to target corresponding phoneme populations in auditory regions. Taken together, the findings from this dissertation advance our understanding of the neural mechanisms that underlie the multiple ways in which vision alters the temporal, spatial, and spectral components of audiovisual speech perception.
dc.language.iso: en_US
dc.subject: Multi-modal imaging
dc.subject: Audiovisual speech processing
dc.subject: Multisensory perception
dc.subject: Intracranial EEG
dc.subject: fMRI
dc.title: The Modulatory Effects of Visual Speech on Auditory Speech Perception: A Multi-Modal Investigation of How Vision Alters the Temporal, Spatial and Spectral Components of Speech
dc.type: Thesis
dc.description.thesisdegreename: PhD
dc.description.thesisdegreediscipline: Psychology
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Brang, David Joseph
dc.contributor.committeemember: Liu, Zhongming
dc.contributor.committeemember: Jahn, Andrew
dc.contributor.committeemember: Sripada, Sekhar Chandra
dc.contributor.committeemember: Weissman, Daniel Howard
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbsecondlevel: Neurosciences
dc.subject.hlbsecondlevel: Psychology
dc.subject.hlbtoplevel: Engineering
dc.subject.hlbtoplevel: Health Sciences
dc.subject.hlbtoplevel: Social Sciences
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/172599/1/gkarthik_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/4628
dc.identifier.orcid: 0000-0002-3029-2273
dc.identifier.name-orcid: Ganesan, Karthikeyan; 0000-0002-3029-2273
dc.working.doi: 10.7302/4628
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
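The abstract above describes decoding word, phoneme, and viseme identities from neural activity with a support-vector machine classifier. The snippet below is a minimal, hypothetical sketch of that kind of analysis, assuming a trial-by-feature matrix X (e.g., high-gamma power per electrode, or voxel responses) and a label vector y; the variable names, data, and parameters are illustrative and are not taken from the dissertation's actual pipeline.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical data: 120 trials x 40 features (e.g., electrodes or voxels),
# each trial labeled with one of four word classes ('b', 'd', 'g', 'f').
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 40))
y = rng.integers(0, 4, size=120)

# Linear SVM with feature standardization; chance accuracy is 0.25 for four classes.
decoder = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
scores = cross_val_score(decoder, X, y, cv=5)
print(f"Mean cross-validated decoding accuracy: {scores.mean():.2f}")

With real iEEG or fMRI features in place of the random matrix, cross-validated accuracy reliably above chance would indicate that the region's activity carries information about the spoken or seen word.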

