merlin.models.tf.EmbeddingTable
- class merlin.models.tf.EmbeddingTable(*args, **kwargs)
Bases: merlin.models.tf.inputs.embedding.EmbeddingTableBase
Embedding table that is backed by a standard Keras Embedding Layer.
- dim: Dimension of the dense embedding.
- col_schema: ColumnSchema
Schema of the column. This is used to infer the cardinality.
- embeddings_initializer: Initializer for the embeddings
matrix (see keras.initializers).
- embeddings_regularizer: Regularizer function applied to
the embeddings matrix (see keras.regularizers).
- embeddings_constraint: Constraint function applied to
the embeddings matrix (see keras.constraints).
- mask_zero: Boolean, whether or not the input value 0 is a special “padding”
value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True, then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1).
- input_length: Length of input sequences, when it is constant.
This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).
- sequence_combiner: A string specifying how to combine embedding results for each
entry (“mean”, “sqrtn” and “sum” are supported) or a layer. Default is None (no combiner used).
- trainable: Boolean, whether the layer’s variables should be trainable.
- name: String name of the layer.
- dtype: The dtype of the layer’s computations and weights. Can also be a
tf.keras.mixed_precision.Policy, which allows the computation and weight dtype to differ. Default of None means to use tf.keras.mixed_precision.global_policy(), which is a float32 policy unless set to a different value.
- dynamic: Set this to True if your layer should only be run eagerly, and
should not be used to generate a static computation graph. This would be the case for a Tree-RNN or a recursive network, for example, or generally for any layer that manipulates tensors using Python control flow. If False, we assume that the layer can safely be used to generate a static computation graph.
- l2_batch_regularization_factor: float, optional
Factor for L2 regularization of the embedding vectors (from the current batch only). Defaults to 0.0.
- **kwargs: Forwarded Keras Layer parameters.
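The combiner options listed above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper name combine is hypothetical, vectors are plain lists, and every looked-up id is assumed to carry unit weight, in which case “sqrtn” reduces to dividing the sum by the square root of the number of vectors.

```python
import math
from typing import List, Optional, Sequence

Vector = List[float]


def combine(vectors: Sequence[Vector], combiner: Optional[str] = None):
    """Reduce the embedding vectors looked up for one multi-valued entry.

    Hypothetical helper for illustration; assumes unit weights per id.
    """
    if combiner is None:
        # No combiner: keep one vector per looked-up id.
        return [list(v) for v in vectors]
    n = len(vectors)
    # Element-wise sum across all vectors of the entry.
    summed = [sum(column) for column in zip(*vectors)]
    if combiner == "sum":
        return summed
    if combiner == "mean":
        return [x / n for x in summed]
    if combiner == "sqrtn":
        return [x / math.sqrt(n) for x in summed]
    raise ValueError(f"Unsupported combiner: {combiner!r}")


# Four 2-dimensional embedding vectors belonging to a single entry.
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
total = combine(rows, "sum")
average = combine(rows, "mean")
scaled = combine(rows, "sqrtn")
```

With no combiner, the entry keeps one vector per id, so downstream layers receive a sequence rather than a single vector.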
- __init__(dim: int, *col_schemas: merlin.schema.schema.ColumnSchema, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None, sequence_combiner: Optional[Union[str, keras.engine.base_layer.Layer]] = None, trainable=True, name=None, dtype=None, dynamic=False, table=None, l2_batch_regularization_factor=0.0, **kwargs)
Create an EmbeddingTable.
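The mask_zero convention described in the parameter list (id 0 reserved for padding, so input_dim equals the vocabulary size plus one) can be sketched without any framework. The helpers build_vocab, pad and padding_mask are hypothetical names introduced for illustration only; they are not part of Merlin Models or Keras.

```python
from typing import Dict, List

PAD_ID = 0  # reserved for padding when mask_zero=True


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Assign ids starting at 1 so that 0 stays free for padding."""
    vocab: Dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab) + 1
    return vocab


def pad(ids: List[int], length: int) -> List[int]:
    """Right-pad (or truncate) a sequence of ids to a fixed length."""
    return (ids + [PAD_ID] * length)[:length]


def padding_mask(ids: List[int]) -> List[bool]:
    """True for real entries, False for padded positions."""
    return [i != PAD_ID for i in ids]


vocab = build_vocab(["shoes", "hat", "shoes", "scarf"])
input_dim = len(vocab) + 1  # vocabulary size + 1 for the padding id
padded = pad([vocab["hat"], vocab["scarf"]], length=4)
mask = padding_mask(padded)
```

Here three distinct tokens produce an input_dim of 4, and the two padded positions are the ones a masking-aware layer would skip.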
Methods
__init__(dim, *col_schemas[, ...]) – Create an EmbeddingTable.
add_feature(col_schema) – Add a feature to the table.
add_loss(losses, **kwargs) – Add loss tensor(s), potentially dependent on layer inputs.
add_metric(value[, name]) – Adds metric tensor to the layer.
add_update(updates) – Add update op(s), potentially dependent on layer inputs.
add_variable(*args, **kwargs) – Deprecated, do NOT use! Alias for add_weight.
add_weight([name, shape, dtype, ...]) – Adds a new variable to the layer.
as_tabular([name])
build(input_shapes)
build_from_config(config)
call(inputs, **kwargs) – Parameter inputs: tensors or dictionary of tensors representing the input batch.
call_outputs(outputs[, training])
check_schema([schema])
compute_call_output_shape(input_shapes)
compute_mask(inputs[, mask]) – Computes an output mask tensor.
compute_output_shape(input_shape)
compute_output_signature(input_signature) – Compute the output tensor signature of the layer based on the inputs.
connect(*block[, block_name, context]) – Connect the block to other blocks sequentially.
connect_branch(*branches[, add_rest, post, ...]) – Connect the block to one or multiple branches.
connect_debug_block([append]) – Connect the block to a debug block.
connect_with_residual(block[, activation]) – Connect the block to other blocks sequentially with a residual connection.
connect_with_shortcut(block[, ...]) – Connect the block to other blocks sequentially with a shortcut connection.
copy()
count_params() – Count the total number of scalars composing the weights.
finalize_state() – Finalizes the layer's state after updating layer weights.
from_config(config[, table])
from_dataset(data[, trainable, name, col_schema]) – Create from pre-trained embeddings from a Dataset or DataFrame.
from_layer(layer)
from_pretrained(data[, trainable, name, ...]) – Create from pre-trained embeddings from a Dataset or DataFrame.
get_build_config()
get_input_at(node_index) – Retrieves the input tensor(s) of a layer at a given node.
get_input_mask_at(node_index) – Retrieves the input mask tensor(s) of a layer at a given node.
get_input_shape_at(node_index) – Retrieves the input shape(s) of a layer at a given node.
get_item_ids_from_inputs(inputs)
get_output_at(node_index) – Retrieves the output tensor(s) of a layer at a given node.
get_output_mask_at(node_index) – Retrieves the output mask tensor(s) of a layer at a given node.
get_output_shape_at(node_index) – Retrieves the output shape(s) of a layer at a given node.
get_padding_mask_from_item_id(inputs[, ...])
get_weights() – Returns the current weights of the layer, as NumPy arrays.
parse(*block)
parse_block(input)
prepare([block, post, aggregation]) – Transform the inputs of this block.
register_features(feature_shapes)
repeat([num]) – Repeat the block num times.
repeat_in_parallel([num, prefix, names, ...]) – Repeat the block num times in parallel.
select_by_name(name)
select_by_tag(tags) – Select features in EmbeddingTable by tags.
set_schema([schema])
set_weights(weights) – Sets the weights of the layer, from NumPy arrays.
to_dataset([gpu])
to_df([gpu])
with_name_scope(method) – Decorator to automatically enter the module name scope.
Attributes
REQUIRES_SCHEMA
activity_regularizer – Optional regularizer function for the output of this layer.
compute_dtype – The dtype of the layer's computations.
context
dtype – The dtype of the layer weights.
dtype_policy – The dtype policy associated with this layer.
dynamic – Whether the layer is dynamic (eager-only); set in the constructor.
has_schema
inbound_nodes – Return Functional API nodes upstream of this layer.
input – Retrieves the input tensor(s) of a layer.
input_dim
input_mask – Retrieves the input mask tensor(s) of a layer.
input_shape – Retrieves the input shape(s) of a layer.
input_spec – InputSpec instance(s) describing the input format for this layer.
losses – List of losses added using the add_loss() API.
metrics – List of metrics added using the add_metric() API.
name – Name of the layer (string), set in the constructor.
name_scope – Returns a tf.name_scope instance for this class.
non_trainable_variables
non_trainable_weights – List of all non-trainable weights tracked by this layer.
outbound_nodes – Return Functional API nodes downstream of this layer.
output – Retrieves the output tensor(s) of a layer.
output_mask – Retrieves the output mask tensor(s) of a layer.
output_shape – Retrieves the output shape(s) of a layer.
registry
schema
stateful
submodules – Sequence of all sub-modules.
supports_masking – Whether this layer supports computing a mask using compute_mask.
table_name
trainable
trainable_variables
trainable_weights – List of all trainable weights tracked by this layer.
updates
variable_dtype – Alias of Layer.dtype, the dtype of the weights.
variables – Returns the list of all layer variables/weights.
weights – Returns the list of all layer variables/weights.
- select_by_tag(tags: Union[merlin.schema.tags.Tags, Sequence[merlin.schema.tags.Tags]]) -> Optional[merlin.models.tf.inputs.embedding.EmbeddingTable]
Select features in EmbeddingTable by tags.
Since an EmbeddingTable can be a shared-embedding table, this method filters the schema for features that match the tags.
If none of the features match the tags, it will return None.
- Parameters
tags (Union[Tags, Sequence[Tags]]) – A tag or a list of tags.
- Return type
An EmbeddingTable if the tags match. If no features match, it returns None.
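The filtering behaviour of select_by_tag described above can be sketched with plain dataclasses. Feature and SharedTable are hypothetical stand-ins, not Merlin classes, and tags are plain strings here. The sketch also assumes that a feature matches when it carries at least one of the requested tags.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple, Union


@dataclass(frozen=True)
class Feature:
    """Hypothetical stand-in for a column schema: a name plus its tags."""
    name: str
    tags: FrozenSet[str]


@dataclass(frozen=True)
class SharedTable:
    """Hypothetical stand-in for an embedding table shared by several features."""
    dim: int
    features: Tuple[Feature, ...]

    def select_by_tag(
        self, tags: Union[str, Iterable[str]]
    ) -> Optional["SharedTable"]:
        wanted = {tags} if isinstance(tags, str) else set(tags)
        # Assumption: a feature matches if it has any of the requested tags.
        matched = tuple(f for f in self.features if f.tags & wanted)
        if not matched:
            return None  # no feature matches the tags
        # Same table dimension, restricted to the matching features.
        return SharedTable(self.dim, matched)


table = SharedTable(
    dim=16,
    features=(
        Feature("item_id", frozenset({"item"})),
        Feature("viewed_item_id", frozenset({"item", "sequence"})),
    ),
)
selected = table.select_by_tag("sequence")
missing = table.select_by_tag(["continuous"])
```

Returning None rather than an empty table lets callers distinguish "nothing selected" with a simple identity check.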
- classmethod from_pretrained(data: Union[merlin.io.dataset.Dataset, pandas.core.frame.DataFrame], trainable=True, name=None, col_schema=None, **kwargs) -> merlin.models.tf.inputs.embedding.EmbeddingTable
Create an EmbeddingTable from pre-trained embeddings in a Dataset or DataFrame.
- Parameters
data (Union[Dataset, DataFrameType]) – A dataset containing the pre-trained embedding weights.
trainable (bool) – Whether the layer should be trained or not.
name (str) – The name of the layer.
- classmethod from_dataset(data: Union[merlin.io.dataset.Dataset, pandas.core.frame.DataFrame], trainable=True, name=None, col_schema=None, **kwargs) -> merlin.models.tf.inputs.embedding.EmbeddingTable
Create an EmbeddingTable from pre-trained embeddings in a Dataset or DataFrame.
- Parameters
data (Union[Dataset, DataFrameType]) – A dataset containing the pre-trained embedding weights.
trainable (bool) – Whether the layer should be trained or not.
name (str) – The name of the layer.
- to_dataset(gpu=None) -> merlin.io.dataset.Dataset
- call(inputs: Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]], **kwargs) -> Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]]
- Parameters
inputs (Union[tf.Tensor, tf.RaggedTensor, tf.SparseTensor]) – Tensors or dictionary of tensors representing the input batch.
- Return type
A tensor or dict of tensors corresponding to the embeddings for inputs.
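The input and output contract of call (a single tensor in, a single tensor out; a dictionary in, a dictionary out) can be sketched with nested lists in place of tensors. The function lookup and the toy table below are hypothetical and only illustrate the shape of the mapping, not the actual layer.

```python
from typing import Dict, List, Union

Ids = List[int]
Vectors = List[List[float]]


def lookup(table: Vectors, inputs: Union[Ids, Dict[str, Ids]]):
    """Map ids to rows of `table`; a dict of inputs yields a dict of outputs."""
    if isinstance(inputs, dict):
        # Apply the same table to every named input.
        return {name: lookup(table, ids) for name, ids in inputs.items()}
    return [table[i] for i in inputs]


# Toy embedding matrix with 3 rows (ids 0..2) and dimension 2.
table = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4]]
single = lookup(table, [2, 1])
batch = lookup(table, {"item_id": [1], "viewed_item_id": [2, 2]})
```

Because both named inputs read from the same matrix, this also mirrors how a shared-embedding table serves several features at once.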