Work Description

Title: Electrocorticographic (ECoG) dataset of an audiovisual task (Open Access, Deposited)

Methodology
  • A de-identified electrocorticographic (ECoG) dataset obtained from 21 subjects performing an audiovisual task was collected from 4 hospitals. The data are preprocessed according to the descriptions provided in the readme file and are provided in three distinct frequency bands: theta (3-7 Hz), beta (13-30 Hz), and high gamma power (70-150 Hz). The task presented phonemes/visemes as auditory-alone and congruent audio-visual stimuli. Only data from electrodes in the superiortemporal, middletemporal, and supramarginal regions have been included in the dataset. Electrode locations and their corresponding MNI vertices, labelled according to FreeSurfer annotations at different resolutions (1 mm, 2 mm, 4 mm, 10 mm, and 20 mm), are included. MATLAB scripts to replicate the results in the accompanying manuscript are also included.
Description
  • Data were acquired from 21 patients with intractable epilepsy undergoing clinical evaluation using iEEG. Patients ranged in age from 15-58 years (mean = 37.1, SD = 12.8) and included 10 females. Across all patients, data were recorded from a total of 1367 electrodes. Each participant was presented with multiple trials of auditory-only and congruent audio-visual stimuli. On each trial a single phoneme was presented to the participant. Three variants of the task were used, each consisting of a different set of phonemes (variant A: /ba/ /da/ /ta/ /tha/, variant B: /ba/ /da/ /ga/, variant C: /ba/ /ga/ /ka/ /pa/). Trials were presented in a random order and phonemes were distributed uniformly across conditions. While conditions were matched in terms of trial numbers, participants completed a variable number of trials (based on task variant and the number of blocks completed). All provided data were resampled to 1024 Hz during initial stages of processing for all participants. Data were referenced in a bipolar fashion (signals subtracted from each immediately adjacent electrode in a pairwise manner) to ensure that the observed signals were derived from maximally local neuronal populations. The preprocessing steps followed are described in the detailed description document in the attached materials.

  • The dataset zip folder consists of three main sub-folders: 1) Electrodes: This folder provides details regarding the individual electrodes for each subject, their MNI coordinates, and their MNI vertex information according to FreeSurfer parcellations. This folder also contains images of the physical location of each of the electrode sets. 2) Processed: This folder contains preprocessed data in all three frequency bands (theta, beta, and high gamma power) for individual subjects and the corresponding vertex locations for each of the electrodes from which their data were recorded. The images subfolder also contains figures provided in the main manuscript. 3) MatlabCodes: This folder contains all the MATLAB scripts required to reproduce the results provided in the main manuscript. LME_AvsAV_Main_Windows.m is the main file that a user has to run to reproduce the results.
Creator
  • Karthik Ganesan, John Plass, Marcia Grabowecky, Satoru Suzuki, William C Stacey, Vibhangini S Wasade, Vernon L Towle, James X Tao, Shasha Wu, Naoum P Issa, David Brang
Depositor
  • gkarthik@umich.edu
Contact information
  • David Brang, djbrang@umich.edu
Discipline
Funding agency
  • National Institutes of Health (NIH)
Keyword
Citations to related material
  • Ganesan, K., Plass, J., Beltz, A. M., Liu, Z., Grabowecky, M., Suzuki, S., ... & Brang, D. (2020). Visual speech differentially modulates beta, theta, and high gamma bands in auditory cortex. bioRxiv. https://doi.org/10.1101/2020.09.07.284455
Related items in Deep Blue Documents
  • Ganesan, K., Plass, J., Beltz, A. M., Liu, Z., Grabowecky, M., Suzuki, S., ... & Brang, D. (2020). Visual speech differentially modulates beta, theta, and high gamma bands in auditory cortex. http://hdl.handle.net/2027.42/167729
Resource type
  • Dataset
Last modified
  • 11/18/2022
Published
  • 06/01/2021
Language
  • English
DOI
  • https://doi.org/10.7302/xhb4-j609
License
  • Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
To Cite this Work:
Brang, D., Karthik, G. (2021). Electrocorticographic (ECoG) dataset of an audiovisual task [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/xhb4-j609

Relationships

This work is not a member of any user collections.

Files (Count: 2; Size: 8.16 GB)

Date: 25 May, 2021

Dataset Title: Electrocorticographic (ECoG) dataset of an audiovisual task.

Dataset Creators: Karthik Ganesan, John Plass, Marcia Grabowecky, Satoru Suzuki, William C Stacey, Vibhangini S Wasade, Vernon L Towle, James X Tao, Shasha Wu, Naoum P Issa, David Brang

Dataset Contact: David Brang djbrang@umich.edu

Funding: NIH Grant R00 DC013828

Key Points:
- We provide a dataset obtained from intracranial EEG (iEEG) / electrocorticographic (ECoG) recordings.
- A total of 21 participants completed tasks involving auditory-only and congruent audio-visual conditions.
- The data are fully preprocessed and ready for analysis in three frequency bands: theta (3-7 Hz), beta (13-30 Hz), and high gamma power (70-150 Hz).

Research Overview:
Speech perception is a central component of social communication. While speech perception is primarily driven by sounds, accurate perception in everyday settings is also supported by meaningful information extracted from visual cues (e.g., speech content, timing, and speaker identity). Previous research has shown that visual speech modulates activity in cortical areas subserving auditory speech perception, including the superior temporal gyrus (STG), likely through feedback connections from the multisensory posterior superior temporal sulcus (pSTS). However, it is unknown whether visual modulation of auditory processing in the STG is a unitary phenomenon or, rather, consists of multiple temporally, spatially, or functionally discrete processes. To explore these questions, we examined neural responses to audiovisual speech in electrodes implanted intracranially in the temporal cortex of 21 patients undergoing clinical monitoring for epilepsy. We found that visual speech modulates auditory processes in the STG in multiple ways, eliciting temporally and spatially distinct patterns of activity that differ across theta, beta, and high-gamma frequency bands. In this dataset, we provide the entire set of recordings we utilized in this research.

Background:

Data were acquired from 21 patients with intractable epilepsy undergoing clinical evaluation using iEEG. Patients ranged in age from 15-58 years (mean = 37.1, SD = 12.8) and included 10 females. iEEG was acquired from clinically implanted depth electrodes (5 mm center-to-center spacing, 2 mm diameter) and/or subdural electrodes (10 mm center-to-center spacing, 3 mm diameter): 13 patients had subdural electrodes and 17 patients had depth electrodes (Supplementary Figure 1). Across all patients, data were recorded from a total of 1367 electrodes (mean = 65, SD = 25.3, range = 24-131 per participant). The number, location, and type of electrodes used were based on the clinical needs of the participants. iEEG recordings were acquired at either 1000 Hz (n = 5 participants), 1024 Hz (n = 11 participants), or 4096 Hz (n = 5 participants) due to differences in clinical amplifiers. All participants provided informed consent under an institutional review board (IRB)-approved protocol at the University of Chicago, Rush University, University of Michigan, or Henry Ford Hospital.

Participants were tested in the hospital at their bedside using a 15-inch MacBook Pro computer running Psychtoolbox (Kleiner et al., 2007). Auditory stimuli were presented through a pair of free-field speakers placed approximately 15 degrees to each side of the patient's midline, adjacent to the laptop. Data were aggregated from three audiovisual speech perception paradigms (using different phonemes spoken by different individuals across tasks) to ensure generalizability of results: 7 participants completed variant A, 8 participants variant B, and 6 participants variant C. Each task presented participants with either auditory only or congruent audio-visual speech stimuli.

On each trial a single phoneme was presented to the participant (variant A: /ba/ /da/ /ta/ /tha/, variant B: /ba/ /da/ /ga/, variant C: /ba/ /ga/ /ka/ /pa/). Trials began with a fixation cross against a black screen that served as the intertrial interval (ITI), presented for an average of 750 ms (random jitter of plus or minus 250 ms, uniformly sampled). In the audiovisual condition, the face appeared either 750 ms before sound onset (task variant B) or 500 ms before sound onset (variants A and C); across all three variants, face motion began 500 ms before sound onset. In the auditory-alone condition, either the fixation cross persisted until sound onset (variant A) or a uniform gray square (matched to the mean contrast of the video images and equal in size) was presented for either 750 ms before sound onset (variant B) or 500 ms before sound onset (variant C). Trials were presented in a random order and phonemes were distributed uniformly across conditions. While conditions were matched in terms of trial numbers, participants completed a variable number of trials (based on task variant and the number of blocks completed): mean = 68 trials per condition (SD = 23, range = 32-96). The onset of each trial was denoted online by a voltage-isolated TTL pulse.
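For concreteness, the jittered ITI and the variant-specific visual lead times described above can be expressed in a few lines of MATLAB; the variable names here are illustrative and not taken from the released scripts:

    % Jittered intertrial interval: mean 750 ms, +/- 250 ms, uniformly sampled
    nTrials = 96;                                     % upper end of the observed range
    itiSec  = 0.750 + (rand(nTrials,1) - 0.5)*0.500;  % each ITI falls in [0.5, 1.0] s
    % Visual onset (face or gray square) relative to sound onset, per variant
    visLeadSec = struct('A', 0.500, 'B', 0.750, 'C', 0.500);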

All data were resampled to 1024 Hz during initial stages of processing for all participants. Data were referenced in a bipolar fashion (signals subtracted from each immediately adjacent electrode in a pairwise manner) to ensure that the observed signals were derived from maximally local neuronal populations. Only electrodes meeting anatomical criteria for auditory regions were included in analyses. Anatomical selection required that an electrode be proximal to an auditory temporal lobe region as defined by the FreeSurfer anatomical labels superiortemporal, middletemporal, and supramarginal in MNI space, resulting in 765 bipolar electrode pairs. Excessively noisy electrodes (removed either manually or due to variability in the raw signal greater than 5 SD compared to all electrodes) were removed from analyses, resulting in 745 remaining electrodes; across participants the mean proportion of channels rejected was 3.3% (SD = 8.7%, range = 0 to 37.5%).
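A minimal sketch of the resampling and bipolar-referencing steps, assuming a channels-by-samples matrix raw and the Signal Processing Toolbox (the authors' actual implementation is in the MatlabCodes folder):

    % Resample from the native rate (1000, 1024, or 4096 Hz) to 1024 Hz
    fsIn  = 4096;                                  % native rate for this participant
    fsOut = 1024;
    data  = resample(double(raw)', fsOut, fsIn)';  % resample works down each column
    % Bipolar montage: difference of immediately adjacent contacts. In practice,
    % pairs are formed within a single depth probe or subdural strip, not across arrays.
    bipolar = data(1:end-1, :) - data(2:end, :);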

Slow drift artifacts and power-line interference were attenuated by high-pass filtering the data at 0.1 Hz and notch-filtering at 60 Hz (and its harmonics at 120, 180, and 240 Hz). Each trial was then segmented into a 2-second epoch centered on the onset of the trial. Individual trials were then separately filtered into three frequency ranges using wavelet convolution and then power transformed: theta (3-7 Hz, wavelet cycles varied linearly from 3-5), beta (13-30 Hz, wavelet cycles varied linearly from 5-10), and HGp (70-150 Hz in 5 Hz intervals, wavelet cycles = 20 at 70 Hz, increasing linearly to maintain the same wavelet duration across frequencies).
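One conventional MATLAB implementation of these steps is sketched below (Signal Processing Toolbox); the filter orders and the single-frequency Morlet construction are illustrative assumptions, not the authors' exact parameters:

    fs = 1024;                                   % sampling rate after resampling
    % High-pass at 0.1 Hz to attenuate slow drift (x: samples x channels)
    [bHp, aHp] = butter(2, 0.1/(fs/2), 'high');
    x = filtfilt(bHp, aHp, x);
    % Notch out 60 Hz line noise and its harmonics
    for f0 = [60 120 180 240]
        [bN, aN] = butter(2, [f0-1, f0+1]/(fs/2), 'stop');
        x = filtfilt(bN, aN, x);
    end
    % Morlet wavelet power at a single frequency (e.g., 5 Hz theta, 4 cycles)
    f = 5; nCyc = 4;
    t = -2:1/fs:2;                               % wavelet support in seconds
    sigma = nCyc/(2*pi*f);                       % Gaussian SD in the time domain
    w = exp(2i*pi*f*t) .* exp(-t.^2/(2*sigma^2));
    pow = abs(conv(x(:,1), w(:), 'same')).^2;    % power time course, one channel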

Within each frequency range and evaluated separately at each electrode, we identified outliers in spectral power at each time point that were 3 scaled median absolute deviations from the median trial response. Outlier values were replaced with the appropriate upper or lower threshold value using the 'clip' option of the Matlab command 'filloutliers'. Across participants, a mean of 0.2% of values were identified as outliers (SD = 0.1%, range = 0.1% to 0.5%).
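Expressed directly in MATLAB, this clipping step is a single call; the matrix name and its trials-by-timepoints orientation are assumptions:

    % Clip values more than 3 scaled MADs from the median trial response.
    % With pow as a trials x timepoints matrix, filloutliers operates down
    % each column, i.e., across trials at every time point.
    powClean = filloutliers(pow, 'clip', 'median', 'ThresholdFactor', 3);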

Methodology:
A de-identified electrocorticographic (ECoG) dataset obtained from 21 subjects performing an audiovisual task was collected from 4 hospitals. The data are preprocessed according to the descriptions provided in the detailed description document and are provided in three distinct frequency bands: theta (3-7 Hz), beta (13-30 Hz), and high gamma power (70-150 Hz). The task presented phonemes/visemes as auditory-alone and congruent audio-visual stimuli. Only data from electrodes in the superiortemporal, middletemporal, and supramarginal regions have been included in the dataset. Electrode locations and their corresponding MNI vertices, labelled according to FreeSurfer annotations at different resolutions (1 mm, 2 mm, 4 mm, 10 mm, and 20 mm), are included. MATLAB scripts to replicate the results in the accompanying manuscript are also included.

Version: 1.0.1

Instrument and/or Software specifications: MATLAB R2019a or higher.

Files contained here:
The dataset zip folder consists of three main sub-folders:

1) Electrodes: This folder provides details regarding the individual electrodes for each subject, their MNI coordinates, and their MNI vertex information according to FreeSurfer parcellations. This folder also contains images of the physical location of each of the electrode sets. Each subject has a separate subfolder within the master folder, and each subject's subfolder contains the bipolar coordinates in MNI space for each electrode.
In addition to the per-subject subfolders, this folder contains two further folders: a) cvs_avg35_inMNI152, which holds the parcellations of a standard MNI brain at different resolution levels, and b) label, which holds the parcellation labels of a standard MNI brain. Both folders are dependencies on which the data processing in the main code relies; an illustrative use is sketched below.
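As an illustrative sketch of how these dependencies can be used, the snippet below maps one electrode coordinate to its nearest vertex on a standard MNI surface and looks up the FreeSurfer parcellation label. It assumes FreeSurfer's bundled MATLAB readers (read_surf, read_annotation) are on the path; the file paths and the electrode coordinate are placeholders:

    % Load a standard MNI surface and its parcellation (placeholder paths)
    [verts, ~]     = read_surf('cvs_avg35_inMNI152/surf/lh.pial');
    [~, lab, ctab] = read_annotation('label/lh.aparc.annot');
    % Nearest surface vertex to one bipolar electrode coordinate (MNI mm)
    elec = [-62, -20, 5];                        % hypothetical STG location
    [~, iMin] = min(sum((verts - elec).^2, 2));
    % Translate the vertex's annotation code into a named label,
    % e.g. 'superiortemporal' (an unmatched code means the vertex is unlabeled)
    name = ctab.struct_names{ctab.table(:, 5) == lab(iMin)};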

2) Processed: This folder contains preprocessed data in all three frequency bands (theta, beta, and high gamma power) for individual subjects and the corresponding vertex locations for each of the electrodes from which their data were recorded. The images subfolder also contains figures provided in the main manuscript. The file names are structured in the format FREQUENCYBAND_SUBJECTID_INFO, where INFO takes one of two values, DATA or VERTEX, indicating whether the file contains the signal data or the vertex information (a minimal loading sketch follows the examples below). Two example files:

- Beta_1092UC_DATA.mat: This indicates that the file consists of raw data for the subject 1092UC in the beta frequency band.

- HG_1092UC_VERTEX.mat: This indicates that the file consists of vertex information for the subject 1092UC in the high gamma frequency band.

This folder also contains ERSP data files for each subject, with the file name format ERSP_SUBJECTID_INFO. An example of one such file:

- ERSP_1092UC_DATA.mat: This indicates that the file consists of the ERSP data for the subject 1092UC. The ERSP file consists of the raw ERSP data for both the auditory and the congruent audiovisual conditions.
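A minimal loading example following this naming scheme; the Processed/ path prefix is an assumption about the unzipped layout, and whos is used first because the variables stored inside each .mat file are not listed here:

    % Inspect, then load, one subject's files
    whos('-file', 'Processed/Beta_1092UC_DATA.mat')   % list the stored variables
    S = load('Processed/Beta_1092UC_DATA.mat');       % beta-band data, subject 1092UC
    V = load('Processed/HG_1092UC_VERTEX.mat');       % high gamma vertex information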

3) MatlabCodes: This folder contains all the MATLAB scripts required to reproduce the results provided in the main manuscript. LME_AvsAV_Main_Windows.m is the main file that a user has to execute to reproduce the results. For ease of use, it is suggested that the user unzip the entire folder along with the raw and processed data into the same path, add that path to MATLAB's search path (e.g., via the Set Path dialog or the addpath command), and then run LME_AvsAV_Main_Windows.m to reproduce the results, as shown below.
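In practice this setup amounts to a few commands; the unzipped folder name below is a placeholder, and addpath/genpath are the programmatic equivalent of the Set Path dialog:

    % Put the scripts and data on the MATLAB path, then run the main script
    addpath(genpath('path/to/unzipped_dataset'));  % MatlabCodes, Processed, Electrodes
    LME_AvsAV_Main_Windows                         % reproduces the manuscript results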

Related publication(s):
Ganesan, K., Plass, J., Beltz, A. M., Liu, Z., Grabowecky, M., Suzuki, S., ... & Brang, D. (2020). Visual speech differentially modulates beta, theta, and high gamma bands in auditory cortex. bioRxiv. https://doi.org/10.1101/2020.09.07.284455

Use and Access:
This data set is made available under an Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).

To Cite Data:
Brang, D., Karthik, G. (2021). Electrocorticographic (ECoG) dataset of an audiovisual task [Data set]. University of Michigan - Deep Blue Data. https://doi.org/10.7302/xhb4-j609

