merlin.models.tf.EmbeddingTable

class merlin.models.tf.EmbeddingTable(*args, **kwargs)

Bases: merlin.models.tf.inputs.embedding.EmbeddingTableBase

Embedding table backed by a standard Keras Embedding layer. For lookup it accepts tf.Tensor, tf.RaggedTensor, and tf.SparseTensor inputs, which may be 2D (batch_size, 1) for scalar features or 3D (batch_size, seq_length, 1) for sequential features.
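To make the two input layouts concrete, here is a minimal pure-Python sketch of an id-to-vector lookup over nested lists. It is not the library's implementation: `table`, `lookup`, and `shape` are hypothetical names, and dropping the trailing axis of size 1 (so the output is (batch_size, dim) or (batch_size, seq_length, dim)) is an assumption that should be checked against the real layer.

```python
# Hypothetical 4-row table with embedding dim 2; row i is the vector for id i.
table = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]


def lookup(ids):
    """Replace every trailing [id] entry with a copy of its embedding row."""
    # Assumption: the trailing axis of size 1 holds the id and is dropped.
    if len(ids) == 1 and isinstance(ids[0], int):
        return list(table[ids[0]])
    return [lookup(item) for item in ids]


def shape(nested):
    """Shape of a rectangular nested list, e.g. (2, 3, 2)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


scalar_batch = [[1], [3]]                              # (batch_size=2, 1)
sequence_batch = [[[1], [2], [0]], [[3], [0], [0]]]    # (batch_size=2, seq_length=3, 1)

print(shape(lookup(scalar_batch)))    # (2, 2)
print(shape(lookup(sequence_batch)))  # (2, 3, 2)
```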

Parameters

dim (int) – Dimension of the dense embedding.

*col_schemas (ColumnSchema) – Schema of the column or columns that use this table. This is used to infer the cardinality.

embeddings_initializer – Initializer for the embeddings matrix (see keras.initializers).

embeddings_regularizer – Regularizer function applied to the embeddings matrix (see keras.regularizers).

embeddings_constraint – Constraint function applied to the embeddings matrix (see keras.constraints).

mask_zero (bool) – Whether the input value 0 is a special “padding” value that should be masked out. This is useful when using recurrent layers, which may take variable-length input. If this is True, all subsequent layers in the model need to support masking or an exception will be raised. As a consequence, index 0 cannot be used in the vocabulary (input_dim should equal the size of the vocabulary + 1).

input_length – Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).

sequence_combiner (str or Layer) – A string specifying how to combine embedding results for each entry (“mean”, “sqrtn” and “sum” are supported), or a layer. Default is None (no combiner used).

trainable (bool) – Whether the layer’s variables should be trainable.

name (str) – Name of the layer.

dtype – The dtype of the layer’s computations and weights. Can also be a tf.keras.mixed_precision.Policy, which allows the computation and weight dtype to differ. The default of None means to use tf.keras.mixed_precision.global_policy(), which is a float32 policy unless set to a different value.

dynamic (bool) – Set this to True if your layer should only be run eagerly and should not be used to generate a static computation graph. This would be the case for a Tree-RNN or a recursive network, for example, or generally for any layer that manipulates tensors using Python control flow. If False, we assume that the layer can safely be used to generate a static computation graph.

l2_batch_regularization_factor (float, optional) – Factor for L2 regularization of the embedding vectors (from the current batch only). Defaults to 0.0.

**kwargs – Forwarded Keras Layer parameters.
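The three string combiners named above (“mean”, “sqrtn”, “sum”) can be illustrated with a small pure-Python sketch. It assumes unweighted inputs, in which case “sqrtn” divides the sum by the square root of the number of vectors, as in TensorFlow's embedding lookup ops. `combine` is a hypothetical helper, not part of the library.

```python
import math


def combine(vectors, combiner):
    """Reduce a list of equal-length embedding vectors to a single vector."""
    if not vectors:
        raise ValueError("need at least one vector to combine")
    n = len(vectors)
    # Element-wise sum across the sequence axis.
    summed = [sum(column) for column in zip(*vectors)]
    if combiner == "sum":
        return summed
    if combiner == "mean":
        return [s / n for s in summed]
    if combiner == "sqrtn":
        # Assumption: all weights are 1, so the normaliser is sqrt(n).
        return [s / math.sqrt(n) for s in summed]
    raise ValueError(f"unsupported combiner: {combiner!r}")


# Embeddings of one sequence with four entries and dim 2.
sequence = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

print(combine(sequence, "sum"))    # [16.0, 20.0]
print(combine(sequence, "mean"))   # [4.0, 5.0]
print(combine(sequence, "sqrtn"))  # [8.0, 10.0]
```

With no combiner (the default None), the per-entry vectors are left as they are and the sequence axis is kept.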

__init__(dim: int, *col_schemas: merlin.schema.schema.ColumnSchema, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None, sequence_combiner: Optional[Union[str, keras.engine.base_layer.Layer]] = None, trainable=True, name=None, dtype=None, dynamic=False, table=None, l2_batch_regularization_factor=0.0, weights=None, **kwargs)

Create an EmbeddingTable.
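When constructing a table with mask_zero=True, id 0 is reserved for padding, so real ids start at 1 and the table needs one extra row. The following pure-Python sketch shows that bookkeeping and the resulting padding mask. The vocabulary, `token_to_id`, and `encode` are hypothetical and only illustrate the convention, not how the library builds its mask.

```python
vocabulary = ["apple", "banana", "cherry"]

# Id 0 is reserved for padding, so real tokens are numbered from 1.
token_to_id = {token: index + 1 for index, token in enumerate(vocabulary)}
input_dim = len(vocabulary) + 1  # size of vocabulary + 1


def encode(tokens, length):
    """Map tokens to ids and right-pad with 0 up to a fixed length."""
    ids = [token_to_id[token] for token in tokens]
    return ids + [0] * (length - len(ids))


batch = [encode(["apple", "cherry"], 4), encode(["banana"], 4)]
# True marks real entries, False marks padding to be masked out.
mask = [[token_id != 0 for token_id in row] for row in batch]

print(input_dim)  # 4
print(batch)      # [[1, 3, 0, 0], [2, 0, 0, 0]]
print(mask)       # [[True, True, False, False], [True, False, False, False]]
```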

Methods

__init__(dim, *col_schemas[, ...])

Create an EmbeddingTable.

add_feature(col_schema)

Add a feature to the table.

add_loss(losses, **kwargs)

Add loss tensor(s), potentially dependent on layer inputs.

add_metric(value[, name])

Adds metric tensor to the layer.

add_update(updates)

Add update op(s), potentially dependent on layer inputs.

add_variable(*args, **kwargs)

Deprecated, do NOT use! Alias for add_weight.

add_weight([name, shape, dtype, ...])

Adds a new variable to the layer.

as_tabular([name])

build(input_shapes)

build_from_config(config)

call(inputs, **kwargs)

inputs – Tensors or dictionary of tensors representing the input batch.

call_outputs(outputs[, training])

check_schema([schema])

compute_call_output_shape(input_shapes)

compute_mask(inputs[, mask])

Computes an output mask tensor.

compute_output_shape(input_shape)

compute_output_signature(input_signature)

Compute the output tensor signature of the layer based on the inputs.

connect(*block[, block_name, context])

Connect the block to other blocks sequentially.

connect_branch(*branches[, add_rest, post, ...])

Connect the block to one or multiple branches.

connect_debug_block([append])

Connect the block to a debug block.

connect_with_residual(block[, activation])

Connect the block to other blocks sequentially with a residual connection.

connect_with_shortcut(block[, ...])

Connect the block to other blocks sequentially with a shortcut connection.

copy()

count_params()

Count the total number of scalars composing the weights.

finalize_state()

Finalizes the layer's state after updating layer weights.

from_config(config[, table])

from_dataset(data[, trainable, name, col_schema])

Create an EmbeddingTable from pre-trained embeddings in a Dataset or DataFrame.

from_layer(layer)

from_pretrained(data[, trainable, name, ...])

Create an EmbeddingTable from pre-trained embeddings in a Dataset or DataFrame.

get_build_config()

get_config()

get_input_at(node_index)

Retrieves the input tensor(s) of a layer at a given node.

get_input_mask_at(node_index)

Retrieves the input mask tensor(s) of a layer at a given node.

get_input_shape_at(node_index)

Retrieves the input shape(s) of a layer at a given node.

get_item_ids_from_inputs(inputs)

get_output_at(node_index)

Retrieves the output tensor(s) of a layer at a given node.

get_output_mask_at(node_index)

Retrieves the output mask tensor(s) of a layer at a given node.

get_output_shape_at(node_index)

Retrieves the output shape(s) of a layer at a given node.

get_padding_mask_from_item_id(inputs[, ...])

get_weights()

Returns the current weights of the layer, as NumPy arrays.

parse(*block)

parse_block(input)

prepare([block, post, aggregation])

Transform the inputs of this block.

register_features(feature_shapes)

repeat([num])

Repeat the block num times.

repeat_in_parallel([num, prefix, names, ...])

Repeat the block num times in parallel.

select_by_name(name)

select_by_tag(tags)

Select features in EmbeddingTable by tags.

set_schema([schema])

set_weights(weights)

Sets the weights of the layer, from NumPy arrays.

to_dataset([gpu])

to_df([gpu])

with_name_scope(method)

Decorator to automatically enter the module name scope.

Attributes

REQUIRES_SCHEMA

activity_regularizer

Optional regularizer function for the output of this layer.

compute_dtype

The dtype of the layer's computations.

context

dtype

The dtype of the layer weights.

dtype_policy

The dtype policy associated with this layer.

dynamic

Whether the layer is dynamic (eager-only); set in the constructor.

has_schema

inbound_nodes

Return Functional API nodes upstream of this layer.

input

Retrieves the input tensor(s) of a layer.

input_dim

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape(s) of a layer.

input_spec

InputSpec instance(s) describing the input format for this layer.

losses

List of losses added using the add_loss() API.

metrics

List of metrics added using the add_metric() API.

name

Name of the layer (string), set in the constructor.

name_scope

Returns a tf.name_scope instance for this class.

non_trainable_variables

non_trainable_weights

List of all non-trainable weights tracked by this layer.

outbound_nodes

Return Functional API nodes downstream of this layer.

output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape(s) of a layer.

registry

schema

stateful

submodules

Sequence of all sub-modules.

supports_masking

Whether this layer supports computing a mask using compute_mask.

table_name

trainable

trainable_variables

trainable_weights

List of all trainable weights tracked by this layer.

updates

variable_dtype

Alias of Layer.dtype, the dtype of the weights.

variables

Returns the list of all layer variables/weights.

weights

Returns the list of all layer variables/weights.

select_by_tag(tags: Union[merlin.schema.tags.Tags, Sequence[merlin.schema.tags.Tags]]) → Optional[merlin.models.tf.inputs.embedding.EmbeddingTable]

Select features in EmbeddingTable by tags.

Since an EmbeddingTable can be a shared-embedding table, this method filters the schema for features that match the tags.

If none of the features match the tags, it will return None.

Parameters

tags (Union[Tags, Sequence[Tags]]) – A list of tags.

Return type

An EmbeddingTable if the tags match. If no features match, it returns None.
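The filtering behaviour described above can be sketched with plain Python stand-ins. `ToyColumn` and `ToyTable` are hypothetical classes, not Merlin types, and treating a column as a match when it carries any of the requested tags is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class ToyColumn:
    """Stand-in for a column schema: a name plus a set of tag strings."""
    name: str
    tags: Set[str]


@dataclass
class ToyTable:
    """Stand-in for a shared embedding table that serves several columns."""
    columns: List[ToyColumn]

    def select_by_tag(self, tags) -> Optional["ToyTable"]:
        # Accept a single tag or a sequence of tags.
        wanted = {tags} if isinstance(tags, str) else set(tags)
        # Assumption: a column matches if it has at least one wanted tag.
        matched = [column for column in self.columns if column.tags & wanted]
        return ToyTable(matched) if matched else None


shared = ToyTable(
    [
        ToyColumn("item_id", {"categorical", "item"}),
        ToyColumn("last_item_id", {"categorical"}),
    ]
)

selected = shared.select_by_tag("item")
print([column.name for column in selected.columns])  # ['item_id']
print(shared.select_by_tag(["user"]))                # None
```

Because the result can be None, callers should check it before using the returned table.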

classmethod from_pretrained(data: Union[merlin.io.dataset.Dataset, pandas.core.frame.DataFrame], trainable=True, name=None, col_schema=None, **kwargs) → merlin.models.tf.inputs.embedding.EmbeddingTable

Create an EmbeddingTable from pre-trained embeddings in a Dataset or DataFrame.

Parameters

data (Union[Dataset, DataFrameType]) – A dataset containing the pre-trained embedding weights.

trainable (bool) – Whether the layer should be trained or not.

name (str) – The name of the layer.

classmethod from_dataset(data: Union[merlin.io.dataset.Dataset, pandas.core.frame.DataFrame], trainable=True, name=None, col_schema=None, **kwargs) → merlin.models.tf.inputs.embedding.EmbeddingTable

Create an EmbeddingTable from pre-trained embeddings in a Dataset or DataFrame.

Parameters

data (Union[Dataset, DataFrameType]) – A dataset containing the pre-trained embedding weights.

trainable (bool) – Whether the layer should be trained or not.

name (str) – The name of the layer.

to_dataset(gpu=None) → merlin.io.dataset.Dataset

to_df(gpu=None)

build(input_shapes)

call(inputs: Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]], **kwargs) → Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]]

Parameters

inputs (Union[tf.Tensor, tf.RaggedTensor, tf.SparseTensor]) – Tensors or dictionary of tensors representing the input batch.

Return type

A tensor or dict of tensors corresponding to the embeddings for inputs
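When l2_batch_regularization_factor is nonzero, the penalty applies only to the embedding vectors returned for the current batch, not to the whole table. A minimal pure-Python sketch of one plausible form of that penalty follows: the factor times the sum of squared entries. The exact reduction used by the library is an assumption here, and `l2_batch_penalty` is a hypothetical helper.

```python
def l2_batch_penalty(batch_embeddings, factor):
    """Return factor * sum of squared entries of the looked-up vectors."""
    return factor * sum(value * value for row in batch_embeddings for value in row)


# Embeddings returned for a batch of two ids, dim 2.
batch_embeddings = [[1.0, 2.0], [0.0, 3.0]]

penalty = l2_batch_penalty(batch_embeddings, factor=0.5)
print(penalty)  # 7.0, i.e. 0.5 * (1 + 4 + 0 + 9)

# With the default factor of 0.0 the penalty vanishes.
print(l2_batch_penalty(batch_embeddings, factor=0.0))  # 0.0
```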

compute_output_shape(input_shape: Union[tensorflow.python.framework.tensor_shape.TensorShape, Dict[str, tensorflow.python.framework.tensor_shape.TensorShape]]) → Union[tensorflow.python.framework.tensor_shape.TensorShape, Dict[str, tensorflow.python.framework.tensor_shape.TensorShape]]

compute_call_output_shape(input_shapes)

classmethod from_config(config, table=None)

get_config()