dataget.image.imagenet

Downloads the ImageNet dataset from the official ImageNet Object Localization Challenge Kaggle competition and loads its metadata as pandas dataframes. You need the Kaggle CLI installed and configured to use this dataset.

import dataget

df_train, df_val, df_test = dataget.image.imagenet().get()
Dataget doesn't load the images of this dataset into memory. Instead, the df_train, df_val, and df_test dataframes have an image_path column containing the relative path of each sample, which you can later use to load the images iteratively during training.
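As a rough sketch of what loading images iteratively could look like, the generator below walks over the image_path values and yields small batches, so only one batch is held in memory at a time. The names iter_image_batches and load_image, the root argument, and the use of Pillow are assumptions made for illustration; they are not part of dataget.

```python
from pathlib import Path


def load_image(path):
    """Open one image as RGB. Assumes Pillow is installed (imported lazily)."""
    from PIL import Image

    with Image.open(path) as img:
        return img.convert("RGB")


def iter_image_batches(image_paths, labels, batch_size=32, root=".", loader=load_image):
    """Yield (images, labels) batches, loading each image only when needed.

    image_paths and labels can be any equal-length iterables, for example
    df_train["image_path"] and df_train["label"]. root is a hypothetical
    base directory that the relative paths are resolved against.
    """
    batch_images, batch_labels = [], []
    for rel_path, label in zip(image_paths, labels):
        batch_images.append(loader(Path(root) / rel_path))
        batch_labels.append(label)
        if len(batch_images) == batch_size:
            yield batch_images, batch_labels
            batch_images, batch_labels = [], []
    # Emit the final, possibly smaller, batch.
    if batch_images:
        yield batch_images, batch_labels
```

With the dataframes from above, a training loop might call `iter_image_batches(df_train["image_path"], df_train["label"], batch_size=64)`; adjust root if the relative paths are not resolved from the current working directory.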

Sample

imagenet-sample

Format

name type shape
df_train pd.DataFrame (544_546, 10)
df_val pd.DataFrame (50_000, 9)
df_test pd.DataFrame (100_000, 2)

Features

column type description df_train df_val df_test
ImageId str image id x x x
image_path str relative path to jpeg image x x x
annotations_path str relative path to pascal voc xml x x
label str label id x x
label_name str label name x x
PredictionString str prediction string x x
xmin int64 prediction string bounding box coord x x
ymin int64 prediction string bounding box coord x x
xmax int64 prediction string bounding box coord x x
ymax int64 prediction string bounding box coord x x
wnid str WordNet ID x
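The annotations_path column points to Pascal VOC XML files, which can be read with the standard library. The sketch below assumes the usual VOC layout (object elements, each with a name and a bndbox holding xmin, ymin, xmax, ymax); parse_voc_boxes is a hypothetical helper, not a dataget function.

```python
import xml.etree.ElementTree as ET


def parse_voc_boxes(xml_text):
    """Return one dict per annotated object in a Pascal VOC XML string.

    Assumes the conventional VOC structure:
    <annotation><object><name/><bndbox><xmin/>...</bndbox></object></annotation>
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append(
            {
                "name": obj.findtext("name"),
                "xmin": int(bndbox.findtext("xmin")),
                "ymin": int(bndbox.findtext("ymin")),
                "xmax": int(bndbox.findtext("xmax")),
                "ymax": int(bndbox.findtext("ymax")),
            }
        )
    return boxes
```

For a single row, something like `parse_voc_boxes(open(row.annotations_path).read())` would return every box in that image, provided the relative path resolves from the current working directory.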

Info

  • Folder name: image_imagenet
  • Size on disk: 161GB