merlin.dataloader.loader_base.LoaderBase
class merlin.dataloader.loader_base.LoaderBase(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)
Bases: object
Base class containing common functionality between the PyTorch and TensorFlow dataloaders.
__init__(dataset, batch_size, shuffle=False, seed_fn=None, parts_per_chunk=1, global_size=None, global_rank=None, drop_last=False, transforms=None, device=None)
Methods
__init__(dataset, batch_size[, shuffle, …])
epochs([epochs]) – Create a dataloader that will efficiently run for more than one epoch.
make_tensors(gdf[, use_row_lengths]) – Yields batches of tensors from a dataframe.
peek() – Get the next batch without advancing the iterator.
stop() – Halts and resets the initialization parameters of the dataloader.
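The contract of peek() (return the next batch, leave the iterator where it was) can be illustrated with a small stand-in. This is a minimal sketch, not Merlin's implementation: the class name `PeekableBatches` and its one-slot buffer are assumptions made for illustration only.

```python
class PeekableBatches:
    """Hypothetical stand-in for a loader; not part of merlin.dataloader."""

    _EMPTY = object()  # sentinel meaning "nothing buffered"

    def __init__(self, batches):
        self._it = iter(batches)
        self._buffered = self._EMPTY

    def peek(self):
        # Fetch the next batch once and hold it until __next__ consumes it.
        if self._buffered is self._EMPTY:
            self._buffered = next(self._it)
        return self._buffered

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out the buffered batch first so peeking never skips data.
        if self._buffered is not self._EMPTY:
            batch, self._buffered = self._buffered, self._EMPTY
            return batch
        return next(self._it)


loader = PeekableBatches([[1, 2], [3, 4], [5]])
first_peek = loader.peek()
second_peek = loader.peek()   # same batch again, nothing was consumed
consumed = list(loader)       # iteration still starts at the first batch
```

Buffering a single batch keeps repeated peeks cheap and guarantees that the batch seen by peek() is also the first one returned by normal iteration.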
Attributes
input_schema – Get input schema of data to be loaded.
output_schema – Get output schema of data being loaded.
schema – Get input schema of data to be loaded.
property transforms
epochs(epochs=1)
Create a dataloader that will efficiently run for more than one epoch.
- Parameters
epochs (int, optional) – Number of epochs for which the dataloader should process data. Defaults to 1.
- Returns
A dataloader that will run for the user-defined number of epochs.
- Return type
DataLoader
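The observable effect of a multi-epoch loader is that one iteration yields every batch `epochs` times in sequence. The sketch below shows only that behaviour with plain Python iterables. `run_epochs` is a hypothetical helper, and it says nothing about how Merlin makes multi-epoch iteration efficient internally.

```python
from itertools import chain


def run_epochs(make_epoch, epochs=1):
    """Hypothetical helper: chain `epochs` fresh passes over the data."""
    # make_epoch is called lazily, once per epoch, so each pass restarts cleanly.
    return chain.from_iterable(make_epoch() for _ in range(epochs))


batches = [[0, 1], [2, 3], [4]]
seen = list(run_epochs(lambda: iter(batches), epochs=3))
```

With three batches per epoch and `epochs=3`, a single loop over the result visits nine batches without the caller writing an outer epoch loop.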
make_tensors(gdf, use_row_lengths=False)
Yields batches of tensors from a dataframe.
- Parameters
gdf (DataFrame) – A dataframe-like object.
use_row_lengths (bool, optional) – Enable using row lengths instead of offsets for list columns, by default False
- Returns
A dictionary of the column tensor representations.
- Return type
Dict[Tensors]
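The use_row_lengths flag chooses between two common encodings of a list (ragged) column stored as one flat values array: offsets mark where each row starts and ends, while row lengths give the number of elements per row. The sketch below uses plain Python lists to show how the two relate. The helper names are hypothetical, and the exact tensor layout and dictionary keys Merlin produces are not shown here.

```python
from itertools import accumulate


def offsets_to_row_lengths(offsets):
    """Length of each row is the gap between consecutive offsets."""
    return [end - start for start, end in zip(offsets, offsets[1:])]


def row_lengths_to_offsets(row_lengths):
    """Offsets are the running sum of row lengths, starting at 0."""
    return [0, *accumulate(row_lengths)]


values = [10, 11, 20, 21, 22, 30]   # flattened list column
offsets = [0, 2, 5, 6]              # row i spans values[offsets[i]:offsets[i + 1]]

row_lengths = offsets_to_row_lengths(offsets)
rows = [values[start:end] for start, end in zip(offsets, offsets[1:])]
round_trip = row_lengths_to_offsets(row_lengths)
```

Both encodings carry the same information. Offsets have one more entry than there are rows, whereas row lengths have exactly one entry per row.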
property schema
Get input schema of data to be loaded.
- Returns
Schema corresponding to the data
- Return type
Schema
property output_schema
Get output schema of data being loaded.
When transforms are defined that change the features being loaded, this output schema accounts for them and should match the features returned by the loader. If there are no transforms, this is the same as the input schema.
- Returns
Schema corresponding to the data that will be output by the loader
- Return type
Schema
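The relationship between the input and output schemas can be sketched with column names standing in for a real Schema object. Everything here is an assumption for illustration: `output_columns`, `drop_user_id`, and the column names are hypothetical, and Merlin transforms are not plain functions over lists of names.

```python
def output_columns(input_columns, transforms=None):
    """Hypothetical model: apply each transform to the list of column names."""
    columns = list(input_columns)
    for transform in transforms or []:
        columns = transform(columns)
    return columns


def drop_user_id(columns):
    """Hypothetical transform that removes one feature from the output."""
    return [name for name in columns if name != "user_id"]


input_columns = ["user_id", "item_id", "rating"]

no_transforms = output_columns(input_columns)
with_transform = output_columns(input_columns, [drop_user_id])
```

Without transforms the output matches the input. With a transform that changes the features, the output describes what the loader actually returns.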