merlin.models.utils.misc_utils.validate_dataset

merlin.models.utils.misc_utils.validate_dataset(paths_or_dataset, batch_size, buffer_size, engine, reader_kwargs)

Utility function that loads an NVTabular Dataset from disk.

Parameters
  • paths_or_dataset (Union[nvtabular.Dataset, str]) – Path to the dataset to load, or an nvtabular Dataset. If a Dataset is passed, the object is returned as-is.

  • batch_size (int) – Batch size for the Dataloader.

  • buffer_size (float) – Fraction of batches to load at once.

  • engine (str) – File format of the dataset. Possible values are “parquet”, “csv”, and “csv-no-header”.

  • reader_kwargs (dict) – Additional arguments for the specified reader.
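
The "path or Dataset" behavior described for paths_or_dataset can be illustrated with a small, self-contained sketch. It does not import Merlin or NVTabular and is not the library's implementation: `StandInDataset` and `validate_dataset_sketch` are hypothetical names, the error raised for an unknown engine is an assumption, and the way the real function uses batch_size and buffer_size is deliberately not modeled.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Union

# Engine names taken from the parameter description above.
SUPPORTED_ENGINES = ("parquet", "csv", "csv-no-header")


@dataclass
class StandInDataset:
    """Hypothetical placeholder for nvtabular.Dataset (not the real class)."""

    path: str
    engine: str
    reader_kwargs: Dict[str, Any] = field(default_factory=dict)


def validate_dataset_sketch(
    paths_or_dataset: Union[StandInDataset, str],
    batch_size: int,
    buffer_size: float,
    engine: str,
    reader_kwargs: Optional[Dict[str, Any]] = None,
) -> StandInDataset:
    """Illustrative stand-in that mirrors the documented signature."""
    # A Dataset that is passed in is returned unchanged.
    if isinstance(paths_or_dataset, StandInDataset):
        return paths_or_dataset

    # Assumption: an unknown engine is rejected with a ValueError.
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"Unsupported engine: {engine!r}")

    # batch_size and buffer_size are accepted only to match the documented
    # signature; their effect on loading is not modeled in this sketch.
    return StandInDataset(
        path=paths_or_dataset,
        engine=engine,
        reader_kwargs=dict(reader_kwargs or {}),
    )


# Usage: a path yields a new dataset object, a dataset passes straight through.
loaded = validate_dataset_sketch(
    "data/train.parquet",  # hypothetical path
    batch_size=1024,
    buffer_size=0.1,
    engine="parquet",
    reader_kwargs={},
)
passed_through = validate_dataset_sketch(loaded, 1024, 0.1, "parquet", {})
```

Accepting either a path or an already-constructed object lets callers reuse a Dataset they have built themselves without the function reloading it from disk.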