merlin.loader.loader_base.LoaderBase
-
class merlin.loader.loader_base.LoaderBase(dataset, batch_size, shuffle, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False)
Bases: object
Base class containing functionality common to the PyTorch and TensorFlow dataloaders.
-
__init__(dataset, batch_size, shuffle, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False)
Methods
__init__(dataset, batch_size, shuffle[, …])
epochs([epochs]) – Create a dataloader that will efficiently run for more than one epoch.
make_tensors(gdf[, use_nnz]) – Turns a gdf into tensor representation by column.
stop() – Halts and resets the initialization parameters of the dataloader.
-
epochs(epochs=1)
Create a dataloader that will efficiently run for more than one epoch.
- Parameters
epochs (int, optional) – Number of epochs for which the dataloader should process data, by default 1
- Returns
A dataloader that will run for the user-defined number of epochs.
- Return type
DataLoader
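The calling pattern described above can be sketched with a toy stand-in. This is only an illustration under stated assumptions: `ToyLoader` and its attributes are hypothetical and are not the Merlin implementation; the sketch shows only that `epochs(n)` returns a loader whose iteration covers the data `n` times.

```python
class ToyLoader:
    """Hypothetical stand-in for a dataloader; not the Merlin implementation."""

    def __init__(self, batches, num_epochs=1):
        self.batches = list(batches)
        self.num_epochs = num_epochs

    def epochs(self, epochs=1):
        # Return a new loader configured for several passes; self is unchanged.
        return ToyLoader(self.batches, num_epochs=epochs)

    def __iter__(self):
        # Yield every batch once per configured epoch.
        for _ in range(self.num_epochs):
            yield from self.batches


loader = ToyLoader(["b0", "b1"])
seen = list(loader.epochs(3))
print(len(seen))  # 6 batches: 2 batches x 3 epochs
```

With a real loader, the same pattern would presumably be `for batch in loader.epochs(n): ...`, so that a single loop covers all epochs.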
-
make_tensors(gdf, use_nnz=False)
Converts a dataframe (gdf) into a tensor representation, column by column.
- Parameters
gdf (DataFrame) – A dataframe type object.
use_nnz (bool, optional) – If True, use nnzs for list columns; otherwise use offsets. By default False
- Returns
A dictionary of the column tensor representations.
- Return type
Dict[Tensors]
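The difference between offsets and nnzs for a list column can be illustrated in plain Python. This is a sketch of one common convention, not the library's code: the helper `list_column_layouts` is hypothetical, and it assumes that offsets mark where each row starts in the flattened values (with a final end marker) while nnzs give the number of values per row.

```python
from itertools import accumulate


def list_column_layouts(rows):
    """Flatten a list column and describe its row boundaries two ways."""
    values = [v for row in rows for v in row]   # flattened values
    nnzs = [len(row) for row in rows]           # number of values per row
    offsets = [0] + list(accumulate(nnzs))      # start index of each row, plus end
    return values, offsets, nnzs


rows = [[1, 2], [3], [4, 5, 6]]
values, offsets, nnzs = list_column_layouts(rows)
print(values)   # [1, 2, 3, 4, 5, 6]
print(offsets)  # [0, 2, 3, 6]
print(nnzs)     # [2, 1, 3]
```

Under this convention either form recovers the other: nnzs are the differences between consecutive offsets, and offsets are the running sum of nnzs.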
-