merlin.dataloader.torch.Loader
class merlin.dataloader.torch.Loader(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)
Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]
__init__(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)
Methods
__init__(dataset, batch_size[, shuffle, …])
array_lib()
convert_batch(batch) – Converts a batch to the appropriate tensor column type and formats it as a flat dictionary in which each list column is stored as separate values and offsets entries.
epochs([epochs]) – Creates a dataloader that runs efficiently for more than one epoch.
make_tensors(gdf[, use_row_lengths]) – Yields batches of tensors from a dataframe.
map(fn) – Applies a function to each batch.
peek() – Returns the next batch from the dataloader without removing it from the queue.
stop() – Halts the dataloader and resets its initialization parameters.
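The behaviour described for map(fn) and peek() can be illustrated with a small stand-alone sketch. PeekableBatches below is a hypothetical toy class written for this illustration only; it is not the Merlin implementation, and details such as map returning the object itself are assumptions.

```python
from collections import deque


class PeekableBatches:
    """Toy batch iterator (hypothetical; not merlin.dataloader code)."""

    def __init__(self, batches):
        self._source = iter(batches)
        self._queue = deque()  # batches fetched by peek() but not yet consumed
        self._fns = []         # functions registered through map()

    def map(self, fn):
        # Register a function to apply to every batch.
        # Returning self to allow chaining is an assumption of this sketch.
        self._fns.append(fn)
        return self

    def _fetch(self):
        # Pull one batch from the source and apply the mapped functions in order.
        batch = next(self._source)
        for fn in self._fns:
            batch = fn(batch)
        return batch

    def peek(self):
        # Look at the next batch while leaving it queued for the next iteration.
        if not self._queue:
            self._queue.append(self._fetch())
        return self._queue[0]

    def __iter__(self):
        return self

    def __next__(self):
        # Serve a previously peeked batch first, otherwise fetch a new one.
        if self._queue:
            return self._queue.popleft()
        return self._fetch()


loader = PeekableBatches([[1, 2], [3, 4], [5]]).map(lambda b: [x * 10 for x in b])
first = loader.peek()       # inspect the first batch
same_first = next(loader)   # the same batch is still delivered by iteration
```

In this sketch, calling peek() repeatedly returns the same batch until it is consumed by iteration, which is the property the method summary describes.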
Attributes
input_schema – Get input schema of data to be loaded.
output_schema – Get output schema of data being loaded.
schema – Get input schema of data to be loaded.
transforms
convert_batch(batch)
Converts a batch to the appropriate tensor column type and formats it as a flat dictionary in which each list column is stored as separate values and offsets entries.
Parameters: batch (tuple) – Tuple of dictionary inputs and n-dimensional array of targets
Returns: A tuple of dictionary inputs, with lists split as values and offsets, and targets as an array
Return type: Tuple
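The values-and-offsets layout mentioned above can be sketched in plain Python. The helper names split_values_offsets and flatten_batch are hypothetical, and the "__values" / "__offsets" key suffixes are an assumption made for this illustration; the real method operates on tensors and may name its keys differently.

```python
def split_values_offsets(rows):
    """Flatten a list of variable-length rows into (values, offsets).

    Row i of the original data is values[offsets[i]:offsets[i + 1]].
    """
    values = []
    offsets = [0]
    for row in rows:
        values.extend(row)
        offsets.append(len(values))
    return values, offsets


def flatten_batch(batch):
    """Return (flat_inputs, targets) with list columns split in two entries."""
    inputs, targets = batch
    flat = {}
    for name, column in inputs.items():
        if column and isinstance(column[0], list):
            # List column: store values and offsets under separate keys
            # (the key suffixes are assumed for this sketch).
            values, offsets = split_values_offsets(column)
            flat[f"{name}__values"] = values
            flat[f"{name}__offsets"] = offsets
        else:
            # Scalar column: pass through unchanged.
            flat[name] = column
    return flat, targets


inputs = {"user_id": [1, 2, 3], "item_ids": [[10, 11], [12], []]}
flat, targets = flatten_batch((inputs, [0, 1, 0]))
```

Here the three rows of item_ids become one flat values list plus an offsets list with one more entry than there are rows, so an empty row is represented by two equal consecutive offsets.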