dataget.image.imagenet
Downloads the ImageNet dataset from the official ImageNet Object Localization Challenge Kaggle competition and loads its metadata as pandas
dataframes. You need the Kaggle CLI installed and configured to use this dataset.
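Before starting a download of this size, it can help to confirm that the Kaggle CLI is reachable and that credentials are in place. The following is a minimal sketch, not part of dataget: the helper names `kaggle_on_path` and `has_kaggle_credentials` are hypothetical, and the check only covers the two commonly used credential locations (the `KAGGLE_USERNAME` / `KAGGLE_KEY` environment variables and `~/.kaggle/kaggle.json`). Your setup may use a different configuration.

```python
import os
import shutil
from pathlib import Path


def kaggle_on_path():
    """Return True if a `kaggle` executable can be found on PATH."""
    return shutil.which("kaggle") is not None


def has_kaggle_credentials(home=None, environ=None):
    """Return True if Kaggle credentials exist in a commonly used location.

    `home` and `environ` are hypothetical parameters added so the check
    can be exercised without touching the real user environment.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    # Option 1: credentials supplied through environment variables.
    has_env = bool(environ.get("KAGGLE_USERNAME")) and bool(environ.get("KAGGLE_KEY"))
    # Option 2: the API token file in its default location.
    has_file = (home / ".kaggle" / "kaggle.json").is_file()
    return has_env or has_file
```

If either function returns False, the download step is likely to fail, so it is worth fixing the Kaggle setup first.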
import dataget
df_train, df_val, df_test = dataget.image.imagenet().get()
Dataget doesn't load the images of this dataset into memory. Instead, the df_train, df_val, and df_test dataframes have an image_path column containing the relative path of each sample, which you can later use to load images iteratively during training.
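One way to load images iteratively from the image_path column is a small generator that reads files from disk one batch at a time. This is a sketch under assumptions rather than a dataget feature: `iter_image_batches` and its `root` parameter are hypothetical names, and `root` stands for whatever directory the relative paths should be resolved against, which this page does not specify. The generator yields raw file bytes; decoding the JPEG data is left to the image library of your choice.

```python
from pathlib import Path


def iter_image_batches(image_paths, root=".", batch_size=32):
    """Yield lists of (path, raw_bytes) pairs, reading files only when needed.

    `image_paths` can be any iterable of relative paths, for example the
    image_path column of df_train. `root` is an assumed base directory.
    """
    batch = []
    for rel_path in image_paths:
        full_path = Path(root) / rel_path
        # Only this file is read here; the rest of the dataset stays on disk.
        batch.append((str(full_path), full_path.read_bytes()))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch
```

A training loop could then iterate with `for batch in iter_image_batches(df_train["image_path"], root=...)`, keeping at most `batch_size` images in memory at once.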
Sample
| | type | shape |
| :--- | :--- | :--- |
| df_train | pd.DataFrame | (544_546, 10) |
| df_val | pd.DataFrame | (50_000, 9) |
| df_test | pd.DataFrame | (100_000, 2) |
Features
| column | type | description | df_train | df_val | df_test |
| :--- | :--- | :--- | :---: | :---: | :---: |
| ImageId | str | image id | x | x | x |
| image_path | str | relative path to jpeg image | x | x | x |
| annotations_path | str | relative path to pascal voc xml | x | x | |
| label | str | label id | x | x | |
| label_name | str | label name | x | x | |
| PredictionString | str | prediction string | x | x | |
| xmin | int64 | prediction string bounding box coord | x | x | |
| ymin | int64 | prediction string bounding box coord | x | x | |
| xmax | int64 | prediction string bounding box coord | x | x | |
| ymax | int64 | prediction string bounding box coord | x | x | |
| wnid | str | WordNet ID | x | | |
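The annotations_path column points to Pascal VOC XML files, which can be read with the standard library. The sketch below assumes the usual Pascal VOC layout (`object` elements, each with a `name` and a `bndbox` holding `xmin`, `ymin`, `xmax`, `ymax`); `read_voc_boxes` is a hypothetical helper, not a dataget function, and the files in this dataset may carry additional fields that it ignores.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_source):
    """Return a list of (name, xmin, ymin, xmax, ymax) tuples.

    `xml_source` is a file path or a file-like object containing a
    Pascal VOC annotation (assumed layout, see the note above).
    """
    root = ET.parse(xml_source).getroot()
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Coordinates are stored as text; go through float() in case a
        # file writes them as "12.0", then convert to int.
        coords = [int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes
```

Each row's annotations_path can be passed to this function after it has been resolved against the directory the relative paths refer to.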
Info
- Folder name: image_imagenet
- Size on disk: 161GB