Distributed Sparse Embedding
- class sparse_operation_kit.embeddings.distributed_embedding.DistributedEmbedding(*args, **kwargs)[source]
- Bases: Layer
- Abbreviated as sok.DistributedEmbedding(*args, **kwargs).
- This is a wrapper class for a distributed sparse embedding layer. It can be used to create a sparse embedding layer that distributes keys to each GPU based on gpu_id = key % gpu_num (see the routing sketch after the parameter list).
- Parameters
- combiner (string) – specifies how to combine the embedding vectors within each slot. Can be Mean or Sum (see the combiner sketch after the Examples). 
- max_vocabulary_size_per_gpu (integer) – the first dimension of embedding variable whose shape is [max_vocabulary_size_per_gpu, embedding_vec_size]. 
- embedding_vec_size (integer) – the second dimension of embedding variable whose shape is [max_vocabulary_size_per_gpu, embedding_vec_size]. 
- slot_num (integer) – the number of feature-fields which will be processed at the same time in each iteration, where all feature-fields produce embedding vectors of the same dimension. 
- max_nnz (integer) – the maximum number of valid keys in each slot (feature-field). 
- max_feature_num (integer = slot_num*max_nnz) – the maximum number of valid keys in each sample. It can be used to save GPU memory when this statistic is known. By default, it is equal to \(max\_feature\_num = slot\_num \times max\_nnz\). 
- use_hashtable (boolean = True) – whether to use a Hashtable in EmbeddingVariable. If True, a Hashtable will be created for dynamic insertion. Otherwise, the input keys will be used directly as indices for the embedding vector lookup, so the input keys must be in the range [0, max_vocabulary_size_per_gpu * gpu_num).
- key_dtype (tf.dtypes = tf.int64) – the data type of input keys. By default, it is tf.int64. 
- embedding_initializer (string or an instance of tf.keras.initializers.Initializer) – the initializer used to generate the initial value for the embedding variable. By default, it uses random_uniform with minval=-0.05, maxval=0.05.
 
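The key distribution rule above, and the key range required when use_hashtable=False, can be illustrated with a short sketch. This is plain Python for illustration only; gpu_num, the key values, and max_vocabulary_size_per_gpu are hypothetical, and the actual routing is performed inside the library rather than by user code.

```python
# Illustration of the documented routing rule gpu_id = key % gpu_num.
# The library applies this rule internally; this snippet is not part of
# the public API, and all values below are made up.
gpu_num = 4                         # hypothetical number of GPUs
keys = [0, 1, 5, 8, 13, 26]         # hypothetical input keys

for key in keys:
    gpu_id = key % gpu_num          # documented distribution rule
    print(f"key {key:3d} -> GPU {gpu_id}")

# With use_hashtable=False, keys index the embedding table directly,
# so every key must lie in [0, max_vocabulary_size_per_gpu * gpu_num).
max_vocabulary_size_per_gpu = 1024  # hypothetical value
assert all(0 <= k < max_vocabulary_size_per_gpu * gpu_num for k in keys)
```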
- Examples

```python
initializer = tf.keras.initializers.RandomUniform()  # or "random_uniform"
emb_layer = sok.DistributedEmbedding(combiner, max_vocabulary_size_per_gpu,
                                     embedding_vec_size, slot_num, max_nnz,
                                     embedding_initializer=initializer)

@tf.function
def _train_step(inputs, labels):
    emb_vectors = emb_layer(inputs)
    ...

for i, (inputs, labels) in enumerate(dataset):
    _train_step(inputs, labels)
```
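As a rough illustration of the combiner parameter, the sketch below reduces the embedding vectors of the valid keys in one slot with Sum versus Mean using plain TensorFlow ops. It mimics the documented behavior only and is not the layer's actual implementation; the vectors are made-up values.

```python
import tensorflow as tf

# Hypothetical embedding vectors looked up for the 2 valid keys of one slot
# (embedding_vec_size = 4).
slot_vectors = tf.constant([[0.1, 0.2, 0.3, 0.4],
                            [0.5, 0.6, 0.7, 0.8]])

sum_combined = tf.reduce_sum(slot_vectors, axis=0)    # combiner = Sum
mean_combined = tf.reduce_mean(slot_vectors, axis=0)  # combiner = Mean
```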
- call(inputs, training=True)[source]
- The forward logic of this wrapper class.
- Parameters
- inputs (tf.sparse.SparseTensor) – keys are stored in SparseTensor.values. SparseTensor.dense_shape is 2-dim and denotes [batchsize * slot_num, max_nnz]. Therefore, the rank of SparseTensor.indices must be 2, denoting [row-indices, column-indices] in the corresponding dense tensor (see the input sketch at the end of this section). 
- training (boolean) – whether training or not. 
 
- Returns
- emb_vector – the embedding vectors for the input keys, with shape [batchsize, slot_num, embedding_vec_size]. 
- Return type
- tf.float
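As a rough guide to the expected input layout, the following sketch builds a tf.sparse.SparseTensor with dense_shape [batchsize * slot_num, max_nnz] and shows how it would be passed to the layer. The sizes, indices, and key values are hypothetical, and emb_layer is assumed to be a DistributedEmbedding instance constructed as in the Examples above.

```python
import tensorflow as tf

# Hypothetical sizes; in practice they must match the layer's constructor.
batchsize, slot_num, max_nnz = 2, 3, 4

# Rows of the dense view enumerate (sample, slot) pairs:
# row 0 = sample 0 / slot 0 (2 valid keys), row 2 = sample 0 / slot 2,
# row 4 = sample 1 / slot 1; all other positions are empty.
indices = [[0, 0], [0, 1],
           [2, 0],
           [4, 0]]
values = tf.constant([13, 27, 5, 8], dtype=tf.int64)  # the input keys
inputs = tf.sparse.SparseTensor(indices=indices,
                                values=values,
                                dense_shape=[batchsize * slot_num, max_nnz])

# emb_vectors = emb_layer(inputs, training=True)
# emb_vectors has shape [batchsize, slot_num, embedding_vec_size]
```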