TF Distributed Embedding
Wrapper classes to build a model-parallel embedding layer entirely with TensorFlow’s API. It uses tf.distribute.Strategy to perform the communication among different GPUs.
- class sparse_operation_kit.embeddings.tf_distributed_embedding.TFDistributedEmbedding(*args, **kwargs)
This embedding layer distributes its embedding parameters across multiple GPUs. It leverages tf.distribute.Strategy for the communication, so tf.distribute.Strategy must be used.
- Parameters
vocabulary_size (integer) – the first dimension of the variable, whose shape is [vocabulary_size, embedding_vec_size].
embedding_vec_size (integer) – the second dimension of the variable, whose shape is [vocabulary_size, embedding_vec_size].
initializer (string, numpy.array = 'GlorotNormal') – when it is a string, it specifies the initializer used to generate the initial values; when it is a numpy.array, its shape must be [vocabulary_size, embedding_vec_size] and it is used directly as the initial value. See the construction sketch after this parameter list.
comm_options (tf.distribute.experimental.CommunicationOptions = None) – the communication options used by tf.distribute.Strategy; see TensorFlow’s documentation for details.
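The following construction sketch is illustrative only: it assumes the constructor accepts keyword arguments named after the parameters above, that the layer is created under strategy.scope() as required, and that NCCL is available for the comm_options example; the sizes are made up.

import numpy as np
import tensorflow as tf
from sparse_operation_kit.embeddings.tf_distributed_embedding import TFDistributedEmbedding

strategy = tf.distribute.MirroredStrategy()  # one replica per visible GPU

with strategy.scope():
    # String form: the named initializer generates the initial values.
    emb_from_name = TFDistributedEmbedding(vocabulary_size=1024,
                                           embedding_vec_size=16,
                                           initializer="GlorotNormal")

    # numpy.array form: the array itself is the initial value; its shape
    # must be [vocabulary_size, embedding_vec_size].
    initial_value = np.random.uniform(size=(1024, 16)).astype("float32")
    emb_from_array = TFDistributedEmbedding(
        vocabulary_size=1024,
        embedding_vec_size=16,
        initializer=initial_value,
        comm_options=tf.distribute.experimental.CommunicationOptions(
            implementation=tf.distribute.experimental.CommunicationImplementation.NCCL))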
Examples
strategy = ...
with strategy.scope():
    embedding_layer = TFDistributedEmbedding(vocabulary_size,
                                             embedding_vec_size,
                                             initializer)
    ...

@tf.function
def _train_step(inputs, labels):
    emb_vectors = embedding_layer(inputs)
    ...

for i, (inputs, labels) in enumerate(dataset):
    strategy.run(_train_step, args=(inputs, labels))
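For orientation, a fuller, self-contained sketch of the same pattern follows. The MirroredStrategy, dense layer, loss, optimizer, and synthetic data are illustrative additions; it also assumes that embedding_layer(inputs) returns vectors of shape inputs.shape + [embedding_vec_size] and that the layer exposes its parameters through trainable_variables. None of this is guaranteed by the reference above.

import tensorflow as tf
from sparse_operation_kit.embeddings.tf_distributed_embedding import TFDistributedEmbedding

vocabulary_size, embedding_vec_size, slot_num = 1024, 16, 10

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    embedding_layer = TFDistributedEmbedding(vocabulary_size=vocabulary_size,
                                             embedding_vec_size=embedding_vec_size,
                                             initializer="GlorotNormal")
    dense_layer = tf.keras.layers.Dense(units=1)
    dense_layer.build((None, slot_num * embedding_vec_size))  # create weights eagerly
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True, reduction="none")

@tf.function
def _train_step(inputs, labels):
    with tf.GradientTape() as tape:
        # assumed output shape: [batch_size, slot_num, embedding_vec_size]
        emb_vectors = embedding_layer(inputs)
        logits = dense_layer(tf.reshape(emb_vectors, [-1, slot_num * embedding_vec_size]))
        loss = tf.nn.compute_average_loss(loss_fn(labels, logits))
    variables = embedding_layer.trainable_variables + dense_layer.trainable_variables
    grads = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(grads, variables))
    return loss

# synthetic (inputs, labels): integer keys in [0, vocabulary_size) and binary labels
all_keys = tf.random.uniform([128, slot_num], maxval=vocabulary_size, dtype=tf.int64)
all_labels = tf.cast(tf.random.uniform([128, 1], maxval=2, dtype=tf.int32), tf.float32)
dataset = tf.data.Dataset.from_tensor_slices((all_keys, all_labels)).batch(32)
dist_dataset = strategy.experimental_distribute_dataset(dataset)

for i, (inputs, labels) in enumerate(dist_dataset):
    strategy.run(_train_step, args=(inputs, labels))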
Notes
Currently, the variables created by this class cannot be saved to files correctly.