

dataget.audio.free_spoken_digit

Downloads the Free Spoken Digit Dataset and loads its metadata as a pandas dataframe. The audio samples are stored as .wav files.

import dataget

df = dataget.audio.free_spoken_digit().get()

Dataget doesn't load the audio into memory. Instead, the df dataframe has an audio_path column containing the relative path of each sample, which you can load with scipy.io.wavfile.read.
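As a minimal sketch of that loading step, the helper below reads every path it is given with scipy.io.wavfile.read. The name load_samples and its root parameter are hypothetical and not part of dataget. The sketch assumes the paths in audio_path resolve relative to the current working directory, so adjust root if they do not.

```python
import os

from scipy.io import wavfile


def load_samples(audio_paths, root="."):
    """Read each .wav file and return a list of (sample_rate, samples) tuples.

    `root` is a hypothetical helper argument: the directory that the
    relative paths are resolved against (assumed to be the working directory).
    """
    samples = []
    for path in audio_paths:
        # wavfile.read returns the sample rate and a NumPy array of samples
        sample_rate, data = wavfile.read(os.path.join(root, path))
        samples.append((sample_rate, data))
    return samples


# Usage with the dataframe from the example above:
# audio = load_samples(df["audio_path"])
```

Returning a list rather than a single array avoids assuming that all recordings have the same length.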

Tip

It's recommended that you split train / test by user rather than randomly, so that the test set doesn't contain samples that closely resemble ones seen in training.
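One way to follow this tip is to hold out every sample from a chosen set of speakers. This is only a sketch: split_by_user is a hypothetical helper, not a dataget function, and it relies solely on the user column described under Features.

```python
def split_by_user(df, test_users):
    """Split a dataframe into (train, test) so that no speaker appears in both.

    `test_users` is an iterable of speaker names to hold out for testing.
    """
    # Boolean mask that is True for rows spoken by a held-out user
    is_test = df["user"].isin(test_users)
    return df[~is_test], df[is_test]


# Usage: hold out the first speaker listed in the dataframe
# held_out = df["user"].unique()[:1]
# df_train, df_test = split_by_user(df, held_out)
```

Which speakers to hold out is left to you. Holding out whole speakers means the test set measures how well a model handles voices it has never heard.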

Format

	type	shape
df	pd.DataFrame	`(2_000, 4)`

Features

column	type	description
audio_path	`str`	Relative path of the audio file
label	`int64`	Target label in the range `[0, 9]`
user	`str`	Name of the speaker
repetition	`int64`	Repetition number for each (user, label) pair, i.e. each user repeats each digit multiple times

Info

Folder name: audio_free_spoken_digit
Size on disk: 26 MB