Work Description

Title: The KSU-UMD Dataset for Benchmarking for Audio Forensic Algorithms (Open Access, Deposited)
Attribute Value
  • This dataset contains 660 sound files recorded from 5 speakers in 4 different languages. Each file is nearly 3 minutes long, and the first minute of each file is silent. The recorded speech files are distributed as follows: 264 from the 2 Arabic speakers, 132 from the English speaker, 132 from the Chinese speaker, and 132 from the Indonesian speaker. A Zoom R16 mixer and 22 different microphones were used to collect this dataset in six different environments. In all recording sessions, we kept a distance of 20 cm between the speakers and the microphones.
  • Details of the microphones used for data collection, the acoustic environments in which the data was collected, and the naming convention used are provided here.

    1 - Microphones Used: The microphones used to collect this dataset belong to 7 different trademarks. Table 1 lists the number of microphones of each trademark and model.

    Table 1: Trademarks and models of microphones
        Mic Trademark     Mic Model     # of Mics
        Shure             SM-58         3
        Electro-Voice     RE-20         2
        Sennheiser        MD-421        3
        AKG               C 451         2
        AKG               C 3000 B      2
        Neumann           KM184         2
        Coles             4038          2
        The t.bone        MB88U         6
        Total                           22

    2 - Environment Description: A brief description of the 6 environments in which the dataset was collected is presented here:
        (i) Soundproof room: a small room (nearly 1.5m × 1.5m × 2m) that is closed and completely isolated. With the exception of a small glass window on the front side of the room, all the walls are made of wood and covered on the inside by a layer of sponge, and the floor is covered by carpet.
        (ii) Class room: a standard class room (6m × 5m × 3m).
        (iii) Lab: a small lab (4m × 4m × 3m). All the walls are made of glass and the floor is covered by carpet. The lab contains 9 computers.
        (iv) Stairs: located on the second floor. The recording area is 3m × 5m.
        (v) Parking: the college parking area.
        (vi) Garden: an open space outside the buildings.

    3 - Naming Convention: The following rules were used to give each file in the dataset a unique name:
        (i) The file name is 20 characters long and consists of 6 sections separated by underscores.
        (ii) The first section, of 3 characters, indicates the microphone trademark.
        (iii) The second section, of 4 characters, indicates the microphone model, as in Table 2.
        (iv) The third section, of 2 characters, indicates a specific microphone within a set of microphones of the same trademark and model, since there is more than one microphone of the same trademark and model.
        (v) The fourth section, of 2 characters, indicates the environment, where
            Soundproof room --> 01
            Class room --> 02
            Lab --> 03
            Stairs --> 04
            Parking --> 05
            Garden --> 06
        (vi) The fifth section, of 2 characters, indicates the language, where
            Arabic --> 01
            English --> 02
            Chinese --> 03
            Indonesian --> 04
        (vii) The sixth section, of 2 characters, indicates the speaker.

    Table 2: Microphone naming criteria
        Original Mic Trademark and Model --> Naming Convention
        Shure SM-58 --> SHU_0058
        Electro-Voice RE-20 --> ELE_0020
        Sennheiser MD-421 --> SEN_0421
        AKG C 451 --> AKG_0451
        AKG C 3000 B --> AKG_3000
        Neumann KM184 --> NEU_0184
        Coles 4038 --> COL_4038
        The t.bone MB88U --> TBO_0088

    For example, SEN_0421_02_01_02_03 is an English file recorded by speaker number 3 in the soundproof room using microphone number 2 of the Sennheiser MD-421 model.
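The naming convention above can be decoded mechanically. The following is a minimal Python sketch, not part of the dataset itself: the names `parse_name`, `Recording`, and the lookup dictionaries are hypothetical, and the handling of a file extension is an assumption, since the description does not state which extension the files carry. The code tables are transcribed from the description and Table 2.

```python
import re
from typing import NamedTuple

# Lookup tables transcribed from the dataset description (Table 2 and rules v-vi).
MICROPHONES = {
    "SHU_0058": "Shure SM-58",
    "ELE_0020": "Electro-Voice RE-20",
    "SEN_0421": "Sennheiser MD-421",
    "AKG_0451": "AKG C 451",
    "AKG_3000": "AKG C 3000 B",
    "NEU_0184": "Neumann KM184",
    "COL_4038": "Coles 4038",
    "TBO_0088": "The t.bone MB88U",
}
ENVIRONMENTS = {
    "01": "Soundproof room",
    "02": "Class room",
    "03": "Lab",
    "04": "Stairs",
    "05": "Parking",
    "06": "Garden",
}
LANGUAGES = {"01": "Arabic", "02": "English", "03": "Chinese", "04": "Indonesian"}

# Six underscore-separated sections: trademark, model, mic, environment, language, speaker.
NAME_PATTERN = re.compile(r"([A-Z]{3}_\d{4})_(\d{2})_(\d{2})_(\d{2})_(\d{2})")


class Recording(NamedTuple):
    """Decoded fields of one file name (hypothetical helper type)."""
    microphone: str
    mic_number: int
    environment: str
    language: str
    speaker: int


def parse_name(name: str) -> Recording:
    """Decode a dataset file name such as 'SEN_0421_02_01_02_03'."""
    # Assumption: an extension, if present, follows the last dot and is ignored.
    stem = name.rsplit(".", 1)[0]
    match = NAME_PATTERN.fullmatch(stem)
    if match is None:
        raise ValueError(f"name does not follow the naming convention: {name!r}")
    mic_code, mic_number, env_code, lang_code, speaker = match.groups()
    try:
        return Recording(
            microphone=MICROPHONES[mic_code],
            mic_number=int(mic_number),
            environment=ENVIRONMENTS[env_code],
            language=LANGUAGES[lang_code],
            speaker=int(speaker),
        )
    except KeyError as err:
        raise ValueError(f"unknown code {err.args[0]!r} in {name!r}") from None
```

Calling `parse_name("SEN_0421_02_01_02_03")` yields the fields of the example above: Sennheiser MD-421, microphone 2, soundproof room, English, speaker 3. Names that do not match the pattern, or that use a code missing from the tables, raise `ValueError` rather than being decoded silently.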
Contact information
Funding agency
  • National Science Foundation (NSF)
Other Funding agency
  • National Science Foundation (NSF)
ORSP grant number
  • 14-PAF05460
Date coverage
  • 2014
Citations to related material
Resource type
Last modified
  • 06/04/2018
DOI
  • doi:10.7302/Z2RJ4GCC
CC License


Files (Count: 666; Size: 10.3 GB)