merlin.dataloader.torch.Loader

class merlin.dataloader.torch.Loader(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)

Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]

__init__(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)

Methods

`__init__`(dataset, batch_size[, shuffle, …])
`array_lib`()
`convert_batch`(batch)	Returns the batch after converting it to the appropriate tensor column type and formatting it as a flat dictionary, in which each list column is split into separate values and offsets entries.
`epochs`([epochs])	Create a dataloader that will efficiently run for more than one epoch.
`make_tensors`(gdf[, use_row_lengths])	Yields batches of tensors from a dataframe.
`map`(fn)	Apply a function to each batch.
`peek`()	Grab the next batch from the dataloader without removing it from the queue.
`stop`()	Halts and resets the initialization parameters of the dataloader.

Attributes

`input_schema`	Get input schema of data to be loaded.
`output_schema`	Get output schema of data being loaded.
`schema`	Get input schema of data to be loaded.
`transforms`

peek()

Grab the next batch from the dataloader without removing it from the queue.
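
The peek semantics can be pictured with a small stand-in iterator. The sketch below is not Merlin's implementation: `PeekableBatches` is a hypothetical class over a plain Python iterable, meant only to illustrate that a peeked batch stays queued and is still returned by the next iteration step.

```python
from collections import deque


class PeekableBatches:
    """Hypothetical stand-in for a batch queue that supports peeking."""

    def __init__(self, batches):
        self._source = iter(batches)
        self._buffer = deque()  # batches fetched by peek() but not yet consumed

    def peek(self):
        # Fetch one batch ahead if nothing is buffered, but keep it queued.
        if not self._buffer:
            self._buffer.append(next(self._source))
        return self._buffer[0]

    def __iter__(self):
        return self

    def __next__(self):
        # Serve a previously peeked batch first, so peeking never skips data.
        if self._buffer:
            return self._buffer.popleft()
        return next(self._source)
```

Calling `peek()` repeatedly on this stand-in returns the same batch each time, and iterating afterwards still starts from that batch.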

convert_batch(batch)

Returns the batch after converting it to the appropriate tensor column type and formatting it as a flat dictionary, in which each list column is split into separate values and offsets entries.

Parameters: batch (tuple) – Tuple of dictionary inputs and n-dimensional array of targets
Returns: A tuple of dictionary inputs, with lists split as values and offsets, and targets as an array
Return type: Tuple
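
To make the values and offsets layout concrete, here is a minimal sketch in plain Python. It uses lists rather than tensors and is not the library's code. The helper names `split_list_column` and `flatten_inputs`, and the `__values` / `__offsets` key suffixes, are assumptions chosen for illustration.

```python
def split_list_column(name, rows):
    """Flatten a ragged list column into values plus row offsets.

    The "__values" / "__offsets" key suffixes are an assumption made for
    this sketch; check the library for the actual key names.
    """
    values = []
    offsets = [0]
    for row in rows:
        values.extend(row)
        offsets.append(len(values))  # end position of this row in `values`
    return {f"{name}__values": values, f"{name}__offsets": offsets}


def flatten_inputs(inputs):
    """Return a flat dict: scalar columns unchanged, list columns split in two."""
    flat = {}
    for name, column in inputs.items():
        if column and isinstance(column[0], list):
            flat.update(split_list_column(name, column))
        else:
            flat[name] = column
    return flat


inputs = {"user_id": [7, 8, 9], "item_ids": [[1, 2], [3], [4, 5, 6]]}
flat = flatten_inputs(inputs)
# flat["item_ids__values"]  == [1, 2, 3, 4, 5, 6]
# flat["item_ids__offsets"] == [0, 2, 3, 6]
```

Row `i` of the original list column can be recovered as `values[offsets[i]:offsets[i + 1]]`, which is why one extra offset is stored beyond the number of rows.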

map(fn)

Apply a function to each batch.

This can be used, for instance, to add sample_weight to the batches fed to the model.
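
As a rough illustration of the sample_weight use case, the sketch below applies a per-batch function through a hypothetical stand-in class, `MappedBatches`, rather than the real Loader. It assumes the mapped function receives `(inputs, targets)` and returns a tuple of the same shape, and that the weight is stored under a `"sample_weight"` key in the inputs. Neither assumption is confirmed by the text above.

```python
def add_sample_weight(inputs, targets):
    """Hypothetical map function: weight positive examples twice as heavily."""
    new_inputs = dict(inputs)  # shallow copy so the original batch is untouched
    new_inputs["sample_weight"] = [2.0 if t == 1 else 1.0 for t in targets]
    return new_inputs, targets


class MappedBatches:
    """Hypothetical stand-in that applies registered functions to each batch."""

    def __init__(self, batches):
        self._batches = batches
        self._map_fns = []

    def map(self, fn):
        self._map_fns.append(fn)
        return self  # returning self allows chained .map(...) calls

    def __iter__(self):
        for inputs, targets in self._batches:
            # Apply the registered functions in the order they were added.
            for fn in self._map_fns:
                inputs, targets = fn(inputs, targets)
            yield inputs, targets


batches = [({"x": [0.1, 0.2]}, [1, 0])]
loader = MappedBatches(batches).map(add_sample_weight)
inputs, targets = next(iter(loader))
# inputs["sample_weight"] == [2.0, 1.0]
```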