dataget.audio.free_spoken_digit¶
Downloads the Free Spoken Digits dataset and loads its metadata as pandas
dataframes. The audio samples are as .wav
files.
import dataget df = dataget.audio.free_spoken_digit().get()
df
dataframe has the audio_path
column which contains the relative path of each sample
. You can easily load them using scipy.io.wavfile.read.
Tip
Its recommended that you split train / test based on user
instead of randomly to avoid testing based on similar samples found in training.
Format¶
type | shape | |
---|---|---|
df | pd.DataFrame | (2_000, 4) |
Features¶
column | type | description |
---|---|---|
audio_path | str |
Relative path of the audio file |
label | int64 |
Target label in the range [0, 9] |
user | str |
Name of the speaker |
repetition | int64 |
Repetition number for each (user, label) pair, i.e. each user repeats each digit multiple times |
Info¶
- Folder name:
audio_free_spoken_digit
- Size on disk:
26MB