All2All Dense Embedding
- class sparse_operation_kit.embeddings.all2all_dense_embedding.All2AllDenseEmbedding(*args, **kwargs)[source]
Bases: Layer
Abbreviated as sok.All2AllDenseEmbedding(*args, **kwargs). This is a wrapper class for the all2all dense embedding layer. It can be used to create a dense embedding layer that distributes keys to each GPU according to gpu_id = key % gpu_num.
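The modulo distribution rule above can be illustrated with a small plain-Python sketch (no SOK required; `route_keys` is an illustrative helper, not part of the SOK API):

```python
# Sketch of the all2all key distribution rule: each key is owned by
# GPU (key % gpu_num). Plain Python, for illustration only.

def route_keys(keys, gpu_num):
    """Group keys by the GPU that owns them under gpu_id = key % gpu_num."""
    buckets = {gpu_id: [] for gpu_id in range(gpu_num)}
    for key in keys:
        buckets[key % gpu_num].append(key)
    return buckets

buckets = route_keys([0, 1, 5, 8, 13], gpu_num=4)
# GPU 0 owns keys 0 and 8; GPU 1 owns keys 1, 5, and 13
```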
- Parameters
max_vocabulary_size_per_gpu (integer) – the first dimension of the embedding variable, whose shape is [max_vocabulary_size_per_gpu, embedding_vec_size].
embedding_vec_size (integer) – the second dimension of the embedding variable, whose shape is [max_vocabulary_size_per_gpu, embedding_vec_size].
slot_num (integer) – the number of feature fields processed at the same time in each iteration, where all feature fields produce embedding vectors of the same dimension.
nnz_per_slot (integer) – the number of valid keys in each slot; every slot has the same number of valid keys.
dynamic_input (boolean = False) – whether inputs.shape is dynamic. For example, the inputs tensor may come from tf.unique. When dynamic_input=True, the unique->lookup->gather pattern can be used. By default, it is False, which means inputs.size must equal replica_batchsize * slot_num * nnz_per_slot.
use_hashtable (boolean = True) – whether to use a hashtable in EmbeddingVariable. If True, a hashtable is created for dynamic insertion; otherwise, the input keys are used directly as indices for the embedding-vector lookup, so the input keys must be in the range [0, max_vocabulary_size_per_gpu * gpu_num).
key_dtype (tf.dtypes = tf.int64) – the data type of input keys. By default, it is tf.int64.
embedding_initializer (string or an instance of tf.keras.initializers.Initializer) – the initializer used to generate initial values for the embedding variable. By default, it uses random_uniform with minval=-0.05, maxval=0.05.
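The unique->lookup->gather pattern enabled by dynamic_input=True can be sketched with NumPy (this is an illustrative stand-in for the deduplicated lookup, not the SOK implementation; `embedding_table` and the shapes are assumptions):

```python
import numpy as np

# Sketch of the unique -> lookup -> gather pattern: deduplicate keys
# before the embedding lookup, then gather rows back to the original
# key positions.

keys = np.array([3, 7, 3, 1, 7, 7])            # flattened input keys, shape [None,]
unique_keys, restore_idx = np.unique(keys, return_inverse=True)

# Stand-in for the embedding lookup, applied only to the unique keys.
embedding_table = np.arange(10 * 4, dtype=np.float32).reshape(10, 4)
unique_vectors = embedding_table[unique_keys]   # shape [num_unique, 4]

# Gather back so every original key position receives its vector.
emb_vectors = unique_vectors[restore_idx]       # shape [len(keys), 4]
```

Deduplicating first means the expensive lookup runs once per distinct key rather than once per occurrence, which is the point of the pattern.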
Examples
initializer = tf.keras.initializers.RandomUniform() # or "random_uniform"
emb_layer = sok.All2AllDenseEmbedding(max_vocabulary_size_per_gpu,
                                      embedding_vec_size,
                                      slot_num, nnz_per_slot,
                                      embedding_initializer=initializer)

@tf.function
def _train_step(inputs, labels):
    emb_vectors = emb_layer(inputs)
    ...

for i, (inputs, labels) in enumerate(dataset):
    _train_step(inputs, labels)
- call(inputs, training=True)[source]
The forward logic of this wrapper class.
- Parameters
inputs (tf.Tensor) – the input keys, stored in a tf.Tensor in row-major order. If dynamic_input=True, inputs.shape must be [None,]; otherwise, inputs.shape must be [batchsize, slot_num, nnz_per_slot].
training (boolean) – whether training or not.
- Returns
emb_vector – the embedding vectors for the input keys. When dynamic_input=False, its shape is [batchsize, slot_num, nnz_per_slot, embedding_vec_size]. Otherwise, its shape is [None, embedding_vec_size], where None equals the size of inputs.
- Return type
tf.float
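The shape contract of call() can be checked with plain arithmetic (the concrete numbers below are illustrative assumptions, not SOK defaults):

```python
# Shape expectations for call() in both input modes.
replica_batchsize, slot_num, nnz_per_slot, embedding_vec_size = 8, 3, 4, 16

# dynamic_input=False: the input must hold exactly
# replica_batchsize * slot_num * nnz_per_slot keys, and the output has
# one embedding vector per key position.
dense_input_size = replica_batchsize * slot_num * nnz_per_slot   # 8 * 3 * 4 = 96
dense_output_shape = (replica_batchsize, slot_num, nnz_per_slot, embedding_vec_size)

# dynamic_input=True: the input is a flat [None,] tensor of keys, and the
# output shape is [None, embedding_vec_size], where None equals the input size.
dynamic_input_size = 57                                          # any number of keys
dynamic_output_shape = (dynamic_input_size, embedding_vec_size)
```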