merlin.dataloader.loader_base.ChunkQueue

class merlin.dataloader.loader_base.ChunkQueue(dataloader, qsize, num_parts=1, shuffle=False, put_wait=1e-06, epochs=1)

Bases: object

This class takes partitions (parts) from a merlin.io.Dataset and concatenates them into a cuDF DataFrame “chunk.” The chunk is then converted to its tensor representation by the iterator’s transform.

Parameters
  • qsize (int) – Maximum number of elements to hold in the buffer at one time.

  • num_parts (int) – Number of partitions from the iterator (a merlin.io.Dataset) to concatenate into a “chunk.”

  • shuffle (bool) – Enable or disable chunk-level shuffling.

  • put_wait (float) – Timeout, in seconds, to wait for space to open up in a full queue before checking for errors and trying again.
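
The retry behavior described for put_wait can be illustrated with the standard library alone. The following is a minimal sketch, not the ChunkQueue implementation: put_with_retry and stop_event are hypothetical names, and a queue.Queue with maxsize stands in for a buffer bounded by qsize.

```python
import queue
import threading


def put_with_retry(q, packet, stop_event, put_wait=1e-6):
    """Hypothetical helper: enqueue `packet`, re-checking a stop flag after each timeout.

    Returns True if the stop flag was set before the packet could be enqueued,
    and False once the packet is in the queue.
    """
    while True:
        if stop_event.is_set():
            return True
        try:
            # Wait at most `put_wait` seconds for a free slot.
            q.put(packet, timeout=put_wait)
            return False
        except queue.Full:
            # Queue still full: loop back, re-check the stop flag, try again.
            continue


q = queue.Queue(maxsize=2)  # maxsize plays the role of qsize here
stop_event = threading.Event()

# The first two packets fit without waiting.
results = [put_with_retry(q, i, stop_event) for i in range(2)]

# The queue is now full. A consumer frees one slot after a short delay,
# so the third put retries until that happens.
consumer = threading.Timer(0.05, q.get)
consumer.start()
blocked_result = put_with_retry(q, 2, stop_event)
consumer.join()

# Once the stop flag is set, a put on the full queue gives up immediately.
stop_event.set()
stopped_result = put_with_retry(q, 3, stop_event)
```

A short timeout keeps the producer responsive to a stop request, at the cost of spinning while the queue stays full.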

Methods

__init__(dataloader, qsize[, num_parts, …])

batch(itr)

Iterates through the dataset in chunks sized by gpu_mem_frac and concatenates every num_parts of them.

chunk_logic(itr)

get()

get_batch_div_chunk(chunks, batch_size)

load_chunks(dev)

put(packet)

start()

stop()
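
The grouping described for batch(itr) above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: group_parts and concat are hypothetical helpers, plain Python lists stand in for cuDF partitions, and yielding a shorter final group when the partitions do not divide evenly is an assumption of this sketch.

```python
from itertools import chain


def group_parts(parts, num_parts):
    """Hypothetical helper: yield lists of up to `num_parts` consecutive items."""
    current = []
    for part in parts:
        current.append(part)
        if len(current) == num_parts:
            yield current
            current = []
    if current:
        # Assumption: leftover partitions form a final, shorter group.
        yield current


def concat(group):
    """Stand-in for DataFrame concatenation: flatten a group of lists."""
    return list(chain.from_iterable(group))


# Five "partitions" of uneven size, grouped two at a time.
partitions = [[1, 2], [3], [4, 5], [6], [7, 8]]
chunks = [concat(g) for g in group_parts(iter(partitions), num_parts=2)]
print(chunks)  # [[1, 2, 3], [4, 5, 6], [7, 8]]
```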

Attributes

empty

stopped
