merlin.models.tf.EmbeddingTable

class merlin.models.tf.EmbeddingTable(*args, **kwargs)

Bases: merlin.models.tf.inputs.embedding.EmbeddingTableBase

Embedding table backed by a standard Keras Embedding layer. For lookup, it accepts tf.Tensor, tf.RaggedTensor, and tf.SparseTensor inputs, which may be 2D (batch_size, 1) for scalar features or 3D (batch_size, seq_length, 1) for sequential features.
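The shape convention above can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the Merlin implementation: the `lookup` helper and the toy `table` are hypothetical, and nested lists stand in for tensors. It assumes the trailing axis of size 1 is dropped, so (batch_size, 1) ids become (batch_size, dim) vectors and (batch_size, seq_length, 1) ids become (batch_size, seq_length, dim).

```python
from typing import List

# Toy embedding matrix: 4 ids (rows), dim = 2 (columns). Values are made up.
table: List[List[float]] = [
    [0.0, 0.0],
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]


def lookup(ids):
    """Replace each innermost [id] cell with a copy of its embedding row.

    Assumption for illustration: the last axis always has size 1 and is
    dropped, matching the input shapes described above.
    """
    if len(ids) == 1 and isinstance(ids[0], int):
        return list(table[ids[0]])
    return [lookup(item) for item in ids]


scalar_ids = [[1], [3]]                   # shape (2, 1)
sequence_ids = [[[1], [2]], [[3], [0]]]   # shape (2, 2, 1)

scalar_out = lookup(scalar_ids)           # shape (2, 2)
sequence_out = lookup(sequence_ids)       # shape (2, 2, 2)
```

In the real layer the same idea applies to dense, ragged, and sparse tensors; the nested lists here only make the shapes easy to inspect.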

dim: Dimension of the dense embedding.

col_schema: ColumnSchema. Schema of the column. This is used to infer the cardinality.

embeddings_initializer: Initializer for the embeddings matrix (see keras.initializers).

embeddings_regularizer: Regularizer function applied to the embeddings matrix (see keras.regularizers).

embeddings_constraint: Constraint function applied to the embeddings matrix (see keras.constraints).

mask_zero: Boolean, whether or not the input value 0 is a special “padding” value that should be masked out. This is useful when using recurrent layers, which may take variable-length input. If this is True, then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1).

input_length: Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).

combiner: A string specifying how to combine embedding results for each entry (“mean”, “sqrtn” and “sum” are supported) or a layer. Default is None (no combiner used). In the __init__ signature this argument is named sequence_combiner.

trainable: Boolean, whether the layer’s variables should be trainable.

name: String name of the layer.

dtype: The dtype of the layer’s computations and weights. Can also be a tf.keras.mixed_precision.Policy, which allows the computation and weight dtype to differ. Default of None means to use tf.keras.mixed_precision.global_policy(), which is a float32 policy unless set to a different value.

dynamic: Set this to True if your layer should only be run eagerly and should not be used to generate a static computation graph. This would be the case for a Tree-RNN or a recursive network, for example, or generally for any layer that manipulates tensors using Python control flow. If False, we assume that the layer can safely be used to generate a static computation graph.

l2_batch_regularization_factor: float, optional. Factor for L2 regularization of the embedding vectors (from the current batch only). Defaults to 0.0.

**kwargs: Forwarded Keras Layer parameters
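Two of the parameters above, mask_zero and combiner, concern variable-length sequence inputs. The following is a minimal pure-Python sketch of their semantics, not the library's code: the helpers `input_dim_for`, `padding_mask`, and `combine` are hypothetical names, and `"sqrtn"` is assumed to mean the sum divided by the square root of the number of entries (the unit-weight case of the convention used by TensorFlow's sparse embedding lookup).

```python
import math
from typing import List, Optional


def input_dim_for(vocab_size: int, mask_zero: bool) -> int:
    """Rows needed in the table; index 0 is reserved for padding when masking."""
    return vocab_size + 1 if mask_zero else vocab_size


def padding_mask(ids: List[int]) -> List[bool]:
    """True for real entries, False for the padding id 0."""
    return [i != 0 for i in ids]


def combine(vectors: List[List[float]], combiner: Optional[str]):
    """Reduce a sequence of embedding vectors to one vector (or pass through)."""
    if combiner is None:
        return vectors  # no combiner: keep one vector per entry
    n = len(vectors)
    summed = [sum(column) for column in zip(*vectors)]
    if combiner == "sum":
        return summed
    if combiner == "mean":
        return [s / n for s in summed]
    if combiner == "sqrtn":
        return [s / math.sqrt(n) for s in summed]
    raise ValueError(f"unsupported combiner: {combiner!r}")


rows = input_dim_for(vocab_size=3, mask_zero=True)
mask = padding_mask([2, 3, 0, 0])

vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
summed = combine(vectors, "sum")
mean = combine(vectors, "mean")
sqrtn = combine(vectors, "sqrtn")
```

How the layer applies a combiner to padded positions is not specified here, so the sketch keeps masking and combining as separate steps.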

__init__(dim: int, *col_schemas: merlin.schema.schema.ColumnSchema, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None, sequence_combiner: Optional[Union[str, keras.engine.base_layer.Layer]] = None, trainable=True, name=None, dtype=None, dynamic=False, table=None, l2_batch_regularization_factor=0.0, weights=None, **kwargs)

Create an EmbeddingTable.
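The signature includes l2_batch_regularization_factor, described above as an L2 penalty on the embedding vectors of the current batch only. The sketch below shows one plausible reading of that description and is not taken from the library: it assumes the penalty is the factor times the sum of squared entries of the looked-up rows, and `l2_batch_penalty` is a hypothetical helper.

```python
from typing import List

# Toy table with 4 rows; only the rows looked up in a batch are penalized.
table: List[List[float]] = [
    [9.0, 9.0],
    [1.0, 2.0],
    [5.0, 5.0],
    [0.0, 3.0],
]


def l2_batch_penalty(batch_embeddings: List[List[float]], factor: float) -> float:
    """Assumed form: factor * sum of squares over the batch's embedding vectors."""
    squared_sum = sum(v * v for row in batch_embeddings for v in row)
    return factor * squared_sum


batch_ids = [1, 3]
batch_embeddings = [table[i] for i in batch_ids]

penalty = l2_batch_penalty(batch_embeddings, factor=0.5)   # 0.5 * (1 + 4 + 0 + 9)
disabled = l2_batch_penalty(batch_embeddings, factor=0.0)  # default factor: no penalty
```

Rows 0 and 2 do not appear in the batch, so they do not contribute to the penalty, which is the difference from regularizing the whole embeddings matrix.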

Methods

`__init__`(dim, *col_schemas[, ...])	Create an EmbeddingTable.
`add_feature`(col_schema)	Add a feature to the table.
`add_loss`(losses, **kwargs)	Add loss tensor(s), potentially dependent on layer inputs.
`add_metric`(value[, name])	Adds metric tensor to the layer.
`add_update`(updates)	Add update op(s), potentially dependent on layer inputs.
`add_variable`(*args, **kwargs)	Deprecated, do NOT use! Alias for add_weight.
`add_weight`([name, shape, dtype, ...])	Adds a new variable to the layer.
`as_tabular`([name])
`build`(input_shapes)
`build_from_config`(config)
`call`(inputs, **kwargs)	Return the embeddings for a tensor or dictionary of tensors representing the input batch.
`call_outputs`(outputs[, training])
`check_schema`([schema])
`compute_call_output_shape`(input_shapes)
`compute_mask`(inputs[, mask])	Computes an output mask tensor.
`compute_output_shape`(input_shape)
`compute_output_signature`(input_signature)	Compute the output tensor signature of the layer based on the inputs.
`connect`(*block[, block_name, context])	Connect the block to other blocks sequentially.
`connect_branch`(*branches[, add_rest, post, ...])	Connect the block to one or multiple branches.
`connect_debug_block`([append])	Connect the block to a debug block.
`connect_with_residual`(block[, activation])	Connect the block to other blocks sequentially with a residual connection.
`connect_with_shortcut`(block[, ...])	Connect the block to other blocks sequentially with a shortcut connection.
`copy`()
`count_params`()	Count the total number of scalars composing the weights.
`finalize_state`()	Finalizes the layers state after updating layer weights.
`from_config`(config[, table])
`from_dataset`(data[, trainable, name, col_schema])	Create an EmbeddingTable from pre-trained embeddings stored in a Dataset or DataFrame.
`from_layer`(layer)
`from_pretrained`(data[, trainable, name, ...])	Create an EmbeddingTable from pre-trained embeddings stored in a Dataset or DataFrame.
`get_build_config`()
`get_config`()
`get_input_at`(node_index)	Retrieves the input tensor(s) of a layer at a given node.
`get_input_mask_at`(node_index)	Retrieves the input mask tensor(s) of a layer at a given node.
`get_input_shape_at`(node_index)	Retrieves the input shape(s) of a layer at a given node.
`get_item_ids_from_inputs`(inputs)
`get_output_at`(node_index)	Retrieves the output tensor(s) of a layer at a given node.
`get_output_mask_at`(node_index)	Retrieves the output mask tensor(s) of a layer at a given node.
`get_output_shape_at`(node_index)	Retrieves the output shape(s) of a layer at a given node.
`get_padding_mask_from_item_id`(inputs[, ...])
`get_weights`()	Returns the current weights of the layer, as NumPy arrays.
`parse`(*block)
`parse_block`(input)
`prepare`([block, post, aggregation])	Transform the inputs of this block.
`register_features`(feature_shapes)
`repeat`([num])	Repeat the block num times.
`repeat_in_parallel`([num, prefix, names, ...])	Repeat the block num times in parallel.
`select_by_name`(name)
`select_by_tag`(tags)	Select features in EmbeddingTable by tags.
`set_schema`([schema])
`set_weights`(weights)	Sets the weights of the layer, from NumPy arrays.
`to_dataset`([gpu])
`to_df`([gpu])
`with_name_scope`(method)	Decorator to automatically enter the module name scope.

Attributes

`REQUIRES_SCHEMA`
`activity_regularizer`	Optional regularizer function for the output of this layer.
`compute_dtype`	The dtype of the layer's computations.
`context`
`dtype`	The dtype of the layer weights.
`dtype_policy`	The dtype policy associated with this layer.
`dynamic`	Whether the layer is dynamic (eager-only); set in the constructor.
`has_schema`
`inbound_nodes`	Return Functional API nodes upstream of this layer.
`input`	Retrieves the input tensor(s) of a layer.
`input_dim`
`input_mask`	Retrieves the input mask tensor(s) of a layer.
`input_shape`	Retrieves the input shape(s) of a layer.
`input_spec`	InputSpec instance(s) describing the input format for this layer.
`losses`	List of losses added using the add_loss() API.
`metrics`	List of metrics added using the add_metric() API.
`name`	Name of the layer (string), set in the constructor.
`name_scope`	Returns a tf.name_scope instance for this class.
`non_trainable_variables`
`non_trainable_weights`	List of all non-trainable weights tracked by this layer.
`outbound_nodes`	Return Functional API nodes downstream of this layer.
`output`	Retrieves the output tensor(s) of a layer.
`output_mask`	Retrieves the output mask tensor(s) of a layer.
`output_shape`	Retrieves the output shape(s) of a layer.
`registry`
`schema`
`stateful`
`submodules`	Sequence of all sub-modules.
`supports_masking`	Whether this layer supports computing a mask using compute_mask.
`table_name`
`trainable`
`trainable_variables`
`trainable_weights`	List of all trainable weights tracked by this layer.
`updates`
`variable_dtype`	Alias of Layer.dtype, the dtype of the weights.
`variables`	Returns the list of all layer variables/weights.
`weights`	Returns the list of all layer variables/weights.

select_by_tag(tags: Union[merlin.schema.tags.Tags, Sequence[merlin.schema.tags.Tags]]) → Optional[merlin.models.tf.inputs.embedding.EmbeddingTable]

Select features in EmbeddingTable by tags.

Since an EmbeddingTable can be a shared-embedding table, this method filters the schema for features that match the tags.

If none of the features match the tags, it will return None.

Parameters: tags (Union[Tags, Sequence[Tags]]) – A tag or a list of tags.
Returns: An EmbeddingTable if any features match the tags; otherwise None.
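The filtering behaviour of select_by_tag can be sketched in plain Python. This is an illustration only: `ToyTable` is a hypothetical stand-in for a shared embedding table, tags are plain strings rather than merlin.schema.Tags, and a feature is assumed to match when it carries at least one of the requested tags.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Set, Union


@dataclass
class ToyTable:
    """Hypothetical shared table: maps each feature name to its set of tags."""

    features: Dict[str, Set[str]]

    def select_by_tag(self, tags: Union[str, Sequence[str]]) -> Optional["ToyTable"]:
        # Accept a single tag or a sequence of tags.
        wanted = {tags} if isinstance(tags, str) else set(tags)
        # Keep only the features that share at least one tag with the request.
        matched = {name: t for name, t in self.features.items() if t & wanted}
        # Mirror the documented contract: None when nothing matches.
        return ToyTable(matched) if matched else None


shared = ToyTable(
    {
        "item_id": {"item", "id"},
        "last_clicked_item": {"item", "sequence"},
    }
)

both = shared.select_by_tag("item")
only_sequence = shared.select_by_tag(["sequence"])
nothing = shared.select_by_tag(["user"])
```

Returning None rather than an empty table lets callers test the result directly before wiring the table into a model.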

classmethod from_pretrained(data: Union[merlin.io.dataset.Dataset, pandas.core.frame.DataFrame], trainable=True, name=None, col_schema=None, **kwargs) → merlin.models.tf.inputs.embedding.EmbeddingTable

Create an EmbeddingTable from pre-trained embeddings stored in a Dataset or DataFrame.

Parameters:
data (Union[Dataset, DataFrameType]) – A dataset containing the pre-trained embedding weights.
trainable (bool) – Whether the layer should be trained or not.
name (str) – The name of the layer.

classmethod from_dataset(data: Union[merlin.io.dataset.Dataset, pandas.core.frame.DataFrame], trainable=True, name=None, col_schema=None, **kwargs) → merlin.models.tf.inputs.embedding.EmbeddingTable

Create an EmbeddingTable from pre-trained embeddings stored in a Dataset or DataFrame.

Parameters:
data (Union[Dataset, DataFrameType]) – A dataset containing the pre-trained embedding weights.
trainable (bool) – Whether the layer should be trained or not.
name (str) – The name of the layer.

to_dataset(gpu=None) → merlin.io.dataset.Dataset

to_df(gpu=None)

build(input_shapes)

call(inputs: Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]], **kwargs) → Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]]

Parameters: inputs (Union[tf.Tensor, tf.RaggedTensor, tf.SparseTensor]) – Tensors or dictionary of tensors representing the input batch.
Returns: A tensor or dict of tensors corresponding to the embeddings for inputs.
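The tensor-or-dictionary contract of call can be sketched as follows. This is a hedged, pure-Python illustration rather than the layer's implementation: `toy_call` and the toy `table` are hypothetical, lists of ids stand in for tensors, and every feature in the dictionary is assumed to share the same table.

```python
from typing import Dict, List, Union

Ids = List[int]
Vectors = List[List[float]]

# Toy shared embedding matrix: 3 ids, dim = 2. Values are made up.
table: Vectors = [
    [0.0, 0.0],
    [1.0, 1.5],
    [2.0, 2.5],
]


def toy_call(inputs: Union[Ids, Dict[str, Ids]]) -> Union[Vectors, Dict[str, Vectors]]:
    """Look up embeddings for one batch of ids, or for each named batch."""
    if isinstance(inputs, dict):
        # Dictionary in, dictionary out: same keys, embeddings as values.
        return {name: toy_call(ids) for name, ids in inputs.items()}
    return [list(table[i]) for i in inputs]


single = toy_call([1, 2])
named = toy_call({"item_id": [2], "last_clicked_item": [0, 1]})
```

The output mirrors the structure of the input, which is what the return type above describes.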

compute_output_shape(input_shape: Union[tensorflow.python.framework.tensor_shape.TensorShape, Dict[str, tensorflow.python.framework.tensor_shape.TensorShape]]) → Union[tensorflow.python.framework.tensor_shape.TensorShape, Dict[str, tensorflow.python.framework.tensor_shape.TensorShape]]

compute_call_output_shape(input_shapes)

classmethod from_config(config, table=None)

get_config()