ASVSpoof Laundered Database

Ali, Hashim; Subramani, Surya; Sudhir, Shefali; Varahamurthy, Raksha; Malik, Hafiz

Work Description

Title: ASVSpoof Laundered Database

Attribute	Value
Methodology	This database is based on the ASVspoof 2019 logical access (LA) eval partition. The ASVspoof 2019 LA eval database is passed through five types of additive noise at three signal-to-noise ratio (SNR) levels, three types of reverberation, six recompression bit rates, four resampling rates, and one type of low-pass filtering, amounting to a total of 1388.22 hours of audio data. The additive noise types are white, babble, volvo, street, and cafe noise. The three reverberation types use reverberation times (RT60) of 0.3, 0.6, and 0.9 s. Recompression is performed at bit rates of 16, 64, 128, 192, 256, and 320 kbit/s. The resampling rates are 8 kHz, 11.025 kHz, 22.05 kHz, and 44.1 kHz.
Description	Voice-cloning (VC) systems have seen an exceptional increase in the realism of synthesized speech in recent years. The high quality of synthesized speech and the availability of low-cost VC services have given rise to many potential abuses of this technology, such as online smearing campaigns and the dissemination of fabricated information. A number of detection methodologies have been proposed over the years that can detect voice spoofs with reasonably good accuracy. However, these methodologies are mostly evaluated on clean audio databases, such as ASVspoof 2019. This research aims to evaluate state-of-the-art (SOTA) audio spoof detection approaches in the presence of laundering attacks. To that end, a new laundering attack database, called the ASVspoof Laundered Database, is created. This database is based on the ASVspoof 2019 LA eval database and comprises a total of 1388.22 hours of audio recordings. Seven SOTA audio spoof detection approaches are evaluated on this laundered database. The results indicate that SOTA systems perform poorly in the presence of aggressive laundering attacks, especially reverberation and additive noise attacks. This suggests the need for robust audio spoof detection.
Creator	Ali, Hashim; Subramani, Surya; Sudhir, Shefali; Varahamurthy, Raksha; Malik, Hafiz
Creator ORCID iD	https://orcid.org/0000-0003-0532-0268
Depositor	alhashim@umich.edu
Contact information	alhashim@umich.edu
Discipline	Engineering
Keyword	Audio Forensics; Audio Antispoofing; Audio Deepfakes; ASVSpoof; Machine Learning
Date coverage	2024-02-10
Resource type	Dataset
Last modified	06/04/2024
Published	06/04/2024
Language	English
DOI	https://doi.org/10.7302/za3p-5005
License	http://creativecommons.org/licenses/by/4.0/

To Cite this Work:
Ali, H., Subramani, S., Sudhir, S., Varahamurthy, R., Malik, H. (2024). ASVSpoof Laundered Database [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/za3p-5005

Files (Count: 3; Size: 120 GB)

Title	Original Upload	Last Modified	File Size	Access
Readme.txt	2024-05-22	2024-06-04	4.54 KB	Open Access
flac.zip	2024-05-22	2024-05-22	120 GB	Open Access
protocols.zip	2024-05-22	2024-05-22	14.2 MB	Open Access

========================================================================================================
ASVspoof Laundered Database: This database is based on the ASVspoof 2019 logical access (LA) eval partition.

The ASVspoof 2019 LA eval database is passed through five types of additive noise at three
signal-to-noise ratio (SNR) levels, three types of reverberation, six recompression bit rates, four
resampling rates, and one type of low-pass filtering, amounting to a total of 1388.22
hours of audio data.

Dataset Creators: Hashim Ali, Surya Subramani, Shefali Sudhir, Raksha Varahamurthy and Hafiz Malik

Dataset Contact: Hashim Ali alhashim@umich.edu

Date Written: 05/29/2024

*** WARNING ***:
The 'flac' folder contains over 2 million (2065873) files. Open this folder at your own risk.
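
As a quick sanity check on the file count above, the laundering conditions listed in this Readme can be enumerated and multiplied by the number of utterances in the source partition. The sketch below is illustrative only: the utterance count of 71237 for the ASVspoof 2019 LA eval partition is an assumption taken from the ASVspoof 2019 documentation, not from this Readme, and the one-file-per-utterance-per-condition layout is inferred from the arithmetic rather than stated by the authors.

```python
# Enumerate the Laundering_Param values listed in this Readme.
noise_types = ["babble", "cafe", "street", "volvo", "white"]
snr_levels = [0, 10, 20]

conditions = [f"{noise}_{snr}" for noise in noise_types for snr in snr_levels]  # 15
conditions += ["RT_0_3", "RT_0_6", "RT_0_9"]                                   # 3
conditions += [f"recompression_{rate}k" for rate in (16, 64, 128, 196, 256, 320)]  # 6
conditions += [f"resample_{rate}" for rate in (8000, 11025, 22050, 44100)]     # 4
conditions += ["lpf_7000"]                                                     # 1

# ASSUMPTION: size of the ASVspoof 2019 LA eval partition (not stated in this Readme).
EVAL_UTTERANCES = 71237

total_files = len(conditions) * EVAL_UTTERANCES
print(len(conditions), total_files)  # 29 2065873
```

Under these assumptions the product matches the 2065873 files reported in the warning, which is consistent with every eval utterance being laundered once per condition.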

========================================================================================================

1. Directory Structure
_______________________

--> ASVspoofLauneredDatabase
    --> flac
    --> protocols
    --> Readme.txt

2. Description of the audio files
_________________________________

The directory flac contains audio files for each type of laundering attack, namely Noise_Addition, Reverberation, Recompression, Resampling, and Filtering. Each laundering
attack (i) has different parameters (j), which are described below in the protocols section. All audio files in this directory are in the FLAC format.
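
For the Noise_Addition attack, the parameter is the SNR at which the noise is mixed into the speech. This Readme does not describe the mixing procedure, so the following is only a minimal sketch of the standard approach, assuming the SNR levels are expressed in dB and that noise is scaled relative to the mean power of the whole utterance. The function names and the synthetic signals are hypothetical stand-ins, not the authors' code.

```python
import math
import random


def power(signal):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # Solve 10*log10(P_speech / (gain^2 * P_noise)) = snr_db for the gain.
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


# Synthetic stand-ins for real audio: a 440 Hz tone and Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for snr_db in (0, 10, 20):
    mixture, scaled_noise = mix_at_snr(speech, noise, snr_db)
    achieved = 10 * math.log10(power(speech) / power(scaled_noise))
    print(f"target {snr_db} dB, achieved {achieved:.2f} dB")
```

A lower SNR value means relatively louder noise, so the `_0` variants are the most aggressive of the three levels.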

3. Description of the protocols
_______________________________

The directory protocols contains five protocol files, one for each laundering attack.

Each line of a protocol file is formatted as:

SPEAKER_ID AUDIO_FILE_NAME SYSTEM_ID KEY Laundering_Type Laundering_Param

1) SPEAKER_ID: LA_****, a 4-digit speaker ID
2) AUDIO_FILE_NAME: LA_****, name of the audio file
3) SYSTEM_ID: ID of the speech spoofing system (A01 - A19); for bonafide speech, SYSTEM_ID is left blank ('-')
4) KEY: 'bonafide' for genuine speech, or 'spoof' for spoofed speech
5) Laundering_Type: Type of laundering attack. One of 'Noise_Addition', 'Reverberation', 'Recompression', 'Resampling', and 'Filtering'
6) Laundering_Param: Parameters of the laundering attack. For example, in the case of Noise_Addition, it can be 'babble_0', where babble is the type of
additive noise and 0 is the SNR level at which the babble noise is added to the audio signal.
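
The six-column layout above can be read with a few lines of Python. This is a minimal sketch that assumes the fields are separated by whitespace; the sample lines and their IDs are hypothetical and are not taken from the actual protocol files.

```python
from collections import Counter
from dataclasses import dataclass

VALID_KEYS = {"bonafide", "spoof"}
VALID_TYPES = {"Noise_Addition", "Reverberation", "Recompression", "Resampling", "Filtering"}


@dataclass(frozen=True)
class ProtocolEntry:
    speaker_id: str
    audio_file_name: str
    system_id: str        # 'A01'..'A19', or '-' for bonafide speech
    key: str              # 'bonafide' or 'spoof'
    laundering_type: str
    laundering_param: str


def parse_protocol_line(line):
    """Parse one whitespace-separated protocol line into a ProtocolEntry."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    entry = ProtocolEntry(*fields)
    if entry.key not in VALID_KEYS:
        raise ValueError(f"unexpected KEY: {entry.key!r}")
    if entry.laundering_type not in VALID_TYPES:
        raise ValueError(f"unexpected Laundering_Type: {entry.laundering_type!r}")
    return entry


# Hypothetical sample lines for illustration only.
sample_lines = [
    "LA_0001 LA_E_0000001 A07 spoof Noise_Addition babble_0",
    "LA_0002 LA_E_0000002 - bonafide Reverberation RT_0_6",
]

entries = [parse_protocol_line(line) for line in sample_lines]
print(Counter(entry.key for entry in entries))
```

Validating KEY and Laundering_Type while parsing catches a mismatched delimiter early, which is helpful before iterating over protocol files that index millions of audio files.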

Note that:

1) The first four columns are the same as in ASVspoof2019_LA_cm_protocols (refer to the ASVspoof2019 database), except that the third column of the original
protocol is omitted since it is NOT used for LA.
2) Brief description of the Laundering_Param values:

babble_0            babble noise at SNR level of 0
babble_10           babble noise at SNR level of 10
babble_20           babble noise at SNR level of 20
cafe_0              cafe noise at SNR level of 0
cafe_10             cafe noise at SNR level of 10
cafe_20             cafe noise at SNR level of 20
street_0            street noise at SNR level of 0
street_10           street noise at SNR level of 10
street_20           street noise at SNR level of 20
volvo_0             volvo noise at SNR level of 0
volvo_10            volvo noise at SNR level of 10
volvo_20            volvo noise at SNR level of 20
white_0             white noise at SNR level of 0
white_10            white noise at SNR level of 10
white_20            white noise at SNR level of 20
RT_0_3              Reverberation with RT60 = 0.3 sec
RT_0_6              Reverberation with RT60 = 0.6 sec
RT_0_9              Reverberation with RT60 = 0.9 sec
recompression_128k  Compression using bit rate of 128 kbit/s
recompression_16k   Compression using bit rate of 16 kbit/s
recompression_196k  Compression using bit rate of 196 kbit/s
recompression_256k  Compression using bit rate of 256 kbit/s
recompression_320k  Compression using bit rate of 320 kbit/s
recompression_64k   Compression using bit rate of 64 kbit/s
resample_11025      resampling rate of 11025 Hz
resample_22050      resampling rate of 22050 Hz
resample_44100      resampling rate of 44100 Hz
resample_8000       resampling rate of 8000 Hz
lpf_7000            low pass filtering with a cut-off frequency of 7 kHz
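
The Laundering_Param strings listed above follow a small set of naming patterns, so they can be decoded into structured values when grouping results by attack strength. The helper below is a hypothetical sketch based only on the names in this list; the dictionary keys it returns are illustrative choices, not part of the database.

```python
import re

# (pattern, decoder) pairs covering every Laundering_Param listed in this Readme.
_PATTERNS = [
    (re.compile(r"(babble|cafe|street|volvo|white)_(\d+)"),
     lambda m: {"attack": "Noise_Addition", "noise": m.group(1), "snr": int(m.group(2))}),
    (re.compile(r"RT_(\d+)_(\d+)"),
     lambda m: {"attack": "Reverberation", "rt60_sec": float(f"{m.group(1)}.{m.group(2)}")}),
    (re.compile(r"recompression_(\d+)k"),
     lambda m: {"attack": "Recompression", "bitrate_kbit_s": int(m.group(1))}),
    (re.compile(r"resample_(\d+)"),
     lambda m: {"attack": "Resampling", "rate_hz": int(m.group(1))}),
    (re.compile(r"lpf_(\d+)"),
     lambda m: {"attack": "Filtering", "cutoff_hz": int(m.group(1))}),
]


def decode_param(param):
    """Decode a Laundering_Param string into a dictionary of attack settings."""
    for pattern, decoder in _PATTERNS:
        match = pattern.fullmatch(param)
        if match:
            return decoder(match)
    raise ValueError(f"unrecognized Laundering_Param: {param!r}")


for name in ("babble_10", "RT_0_6", "recompression_196k", "resample_11025", "lpf_7000"):
    print(name, decode_param(name))
```

Using `fullmatch` rejects strings with trailing characters, so an unexpected parameter name raises an error instead of being silently mapped to the wrong attack.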
