

dataget.kaggle

Download any dataset from the Kaggle platform and immediately load it into memory:

    import dataget

    df_train, df_test = dataget.kaggle(dataset="cristiangarcia/pointcloudmnist2d").get(
        files=["train.csv", "test.csv"]
    )

In this example we download the Point Cloud Mnist 2D dataset from Kaggle and load its train.csv and test.csv files as pandas DataFrames.

Config

Before using this dataset, make sure you have installed and configured the Kaggle API.
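Before calling `dataget.kaggle`, it can help to confirm that Kaggle credentials are discoverable. The sketch below assumes the conventions documented for the official Kaggle API client: a `kaggle.json` token in `~/.kaggle/` (or in the directory named by `KAGGLE_CONFIG_DIR`), or the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables. The helper `kaggle_credentials_available` is a hypothetical name and is not part of dataget.

```python
import os
from pathlib import Path


def kaggle_credentials_available(env=None, home=None):
    """Return True if Kaggle API credentials appear to be configured.

    Hypothetical helper (not part of dataget). It only checks that the
    credentials exist, not that they are valid.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Option 1: credentials supplied through environment variables.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True

    # Option 2: a kaggle.json token file, by default under ~/.kaggle/.
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR") or home / ".kaggle")
    return (config_dir / "kaggle.json").is_file()


if __name__ == "__main__":
    print("Kaggle credentials found:", kaggle_credentials_available())
```

The `env` and `home` parameters exist only so the check can be exercised without touching the real environment.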

Supported Formats

Right now only the CSV format is supported. In the future we want to be able to load any file that NumPy or pandas can read.
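Since only CSV files can be loaded at the moment, a caller may want to screen a file list before requesting it. The following is a minimal sketch of such a pre-check; `SUPPORTED_SUFFIXES` and `split_by_support` are hypothetical names, and matching on the exact `.csv` suffix is an assumption rather than a description of how dataget decides.

```python
from pathlib import Path

# Assumption: only CSV is supported, as stated in the text above.
SUPPORTED_SUFFIXES = {".csv"}


def split_by_support(files):
    """Split file names into (supported, unsupported) lists, keeping order."""
    supported, unsupported = [], []
    for name in files:
        if Path(name).suffix in SUPPORTED_SUFFIXES:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


supported, unsupported = split_by_support(["train.csv", "test.csv", "labels.json"])
print(supported)    # ['train.csv', 'test.csv']
print(unsupported)  # ['labels.json']
```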

API Reference

`kaggle`

`__init__(self, dataset=None, competition=None, **kwargs)`

Source code in kaggle.py:

    def __init__(self, dataset: str = None, competition: str = None, **kwargs):
        """
        Create a Kaggle dataset. You have to specify either `dataset` or `competition`.

        Arguments:
            dataset: the id of the kaggle dataset in the format `username/dataset_name`.
            competition: the name of the kaggle competition.
            kwargs: common init kwargs.
        """
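        # Exactly one of `dataset` / `competition` must be set. The parentheses
        # are required: without them Python chains the three comparisons.
        assert (dataset is not None) != (competition is not None), (
            "Set either dataset or competition"
        )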

        self.kaggle_dataset = dataset
        self.kaggle_competition = competition

        super().__init__(**kwargs)

Create a Kaggle dataset. You must specify exactly one of `dataset` or `competition`.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `str` | the id of the Kaggle dataset in the format `username/dataset_name`. | `None` |
| `competition` | `str` | the name of the Kaggle competition. | `None` |
| `**kwargs` | (unannotated) | common init kwargs. | `{}` |
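The "exactly one of `dataset` or `competition`" rule is an exclusive-or on whether each argument was given. The sketch below shows that check in isolation; `check_exactly_one` is a hypothetical standalone function, not part of dataget, and `"some-competition"` is a placeholder name.

```python
def check_exactly_one(dataset=None, competition=None):
    """Raise ValueError unless exactly one of the two arguments is set."""
    has_dataset = dataset is not None
    has_competition = competition is not None

    # XOR on two booleans: they must differ for exactly one to be set.
    if has_dataset == has_competition:
        raise ValueError("Set either dataset or competition")


check_exactly_one(dataset="cristiangarcia/pointcloudmnist2d")  # passes
check_exactly_one(competition="some-competition")              # passes

# Neither argument, or both arguments, is rejected.
for bad_kwargs in ({}, {"dataset": "user/name", "competition": "some-competition"}):
    try:
        check_exactly_one(**bad_kwargs)
    except ValueError as error:
        print("rejected:", error)
```

Computing the two booleans separately sidesteps Python's comparison chaining, under which an unparenthesized `a is not None != b is not None` is evaluated as `a is not None and None != b and b is not None`.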

`load(self, files)`

Source code in kaggle.py:

    def load(self, files: list):
        """
        Arguments:
            files: the list of files that will be loaded into memory
        """

        return [self._load_file(filename) for filename in files]

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `files` | `list` | the list of files that will be loaded into memory | required |
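Because `load` builds its result with a list comprehension over `files`, the returned list has one entry per requested file, in the same order, which is what makes tuple unpacking such as `df_train, df_test = ...` work. The sketch below illustrates that pattern on its own. It uses the standard-library `csv` module as a dependency-free stand-in for pandas, and `_load_csv` and `load_csv_files` are hypothetical names; the real `_load_file` is not shown in this page.

```python
import csv
import tempfile
from pathlib import Path


def _load_csv(path):
    """Read one CSV file into a list of dict rows (stand-in for a DataFrame)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def load_csv_files(directory, files):
    """Load each named file from `directory`, preserving the order of `files`."""
    return [_load_csv(Path(directory) / filename) for filename in files]


# Demo with two tiny files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "train.csv").write_text("x,y\n1,2\n3,4\n")
    Path(tmp, "test.csv").write_text("x,y\n5,6\n")

    train_rows, test_rows = load_csv_files(tmp, ["train.csv", "test.csv"])

print(len(train_rows), len(test_rows))  # 2 1
print(dict(test_rows[0]))               # {'x': '5', 'y': '6'}
```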