merlin.dataloader.torch.Loader
class merlin.dataloader.torch.Loader(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)
Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]
__init__(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)
Methods
__init__(dataset, batch_size[, shuffle, …])
array_lib()
convert_batch(batch) – Converts a batch to the appropriate tensor column type, then formats it as a flat dictionary in which each list column is split into separate values and offsets entries.
epochs([epochs]) – Create a dataloader that will efficiently run for more than one epoch.
make_tensors(gdf[, use_row_lengths]) – Yields batches of tensors from a dataframe.
map(fn) – Applies a function to each batch.
peek() – Grab the next batch from the dataloader without removing it from the queue.
stop() – Halts and resets the initialization parameters of the dataloader.
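The semantics that the summaries above describe for map and peek can be illustrated with a toy stand-in. This is only a sketch under stated assumptions: ToyBatchQueue and all of its internals are hypothetical and are not Merlin's implementation. It shows a function being applied to each batch, and a peek that returns the next batch without consuming it.

```python
from collections import deque


class ToyBatchQueue:
    """Hypothetical stand-in for a batch loader (not the Merlin Loader)."""

    def __init__(self, batches):
        self._source = iter(batches)
        self._buffer = deque()  # holds a batch that was peeked but not consumed
        self._fns = []          # functions applied to every fetched batch

    def map(self, fn):
        # Register a function to apply to each batch as it is fetched.
        self._fns.append(fn)
        return self

    def _fetch(self):
        batch = next(self._source)
        for fn in self._fns:
            batch = fn(batch)
        return batch

    def peek(self):
        # Fetch the next batch if needed, but leave it in the buffer.
        if not self._buffer:
            self._buffer.append(self._fetch())
        return self._buffer[0]

    def __iter__(self):
        return self

    def __next__(self):
        # Serve a previously peeked batch first, otherwise fetch a new one.
        if self._buffer:
            return self._buffer.popleft()
        return self._fetch()


loader = ToyBatchQueue([{"x": [1, 2]}, {"x": [3, 4]}])
loader.map(lambda b: {name: [v * 10 for v in col] for name, col in b.items()})
peeked = loader.peek()
first = next(loader)
second = next(loader)
```

Here peeked and first are the same batch, because peeking does not advance the queue. The real Loader may differ in how and when mapped functions are applied.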
Attributes
input_schema – Get input schema of data to be loaded.
output_schema – Get output schema of data being loaded.
schema – Get input schema of data to be loaded.
transforms
convert_batch(batch)
Converts a batch to the appropriate tensor column type, then formats it as a flat dictionary in which each list column is split into separate values and offsets entries.
Parameters:
batch (tuple) – Tuple of dictionary inputs and n-dimensional array of targets
Returns:
A tuple of dictionary inputs, with lists split as values and offsets, and targets as an array
Return type:
Tuple
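The values-and-offsets layout that convert_batch describes for list columns can be sketched in plain Python. This is a minimal illustration, not the library's code: the helper names split_list_column and flatten_batch are hypothetical, and the "__values" / "__offsets" key suffixes are an assumption made for the example. Plain lists stand in for tensors.

```python
from itertools import accumulate, chain


def split_list_column(rows):
    """Split a ragged list column into flat values plus row offsets."""
    values = list(chain.from_iterable(rows))
    # offsets[i] is where row i starts in values; the last entry is len(values).
    offsets = [0] + list(accumulate(len(row) for row in rows))
    return values, offsets


def flatten_batch(inputs):
    """Return a flat dict; list columns become two entries (assumed key names)."""
    flat = {}
    for name, column in inputs.items():
        if column and isinstance(column[0], list):
            values, offsets = split_list_column(column)
            flat[f"{name}__values"] = values
            flat[f"{name}__offsets"] = offsets
        else:
            flat[name] = column
    return flat


inputs = {"user_id": [7, 8, 9], "item_history": [[1, 2], [3], [4, 5, 6]]}
flat = flatten_batch(inputs)

# Row i of the original list column can be recovered by slicing with the offsets.
offsets = flat["item_history__offsets"]
row_2 = flat["item_history__values"][offsets[2]:offsets[3]]
```

In this layout a batch with rows of different lengths needs no padding, since each row is recovered as values[offsets[i]:offsets[i + 1]].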