merlin.models.tf.SequenceMaskRandom

class merlin.models.tf.SequenceMaskRandom(*args, **kwargs)

Bases: merlin.models.tf.transforms.sequence.SequenceTargetAsInput

This block implements the Masked Language Modeling (MLM) training approach introduced by BERT for NLP and later adapted to recommender systems by BERT4Rec [1]. Given an input tf.RaggedTensor with sequences of embeddings and the corresponding sequence of item ids, some positions are randomly selected (masked) to be the targets for prediction. The targets returned are identical to the input item id sequence. The target mask is returned through Keras masking (the tensor's ._keras_mask attribute), which is set by the compute_mask() method.

Note: SequenceMaskRandom is meant to be passed as the pre argument of model.fit(), e.g. model.fit(…, pre=SequenceMaskRandom(…)).

Note: During model.evaluate() you typically want the model to predict the last position of the sequence, because that mimics the next-item prediction task. In that case, use model.evaluate(…, pre=SequenceMaskLast(…)) instead of SequenceMaskRandom(…).
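The idea of targeting only the last position at evaluation time can be illustrated with a minimal pure-Python sketch that does not depend on Merlin Models. The helper name `mask_last_position` is hypothetical and exists only for this illustration; it is not the library's SequenceMaskLast implementation.

```python
def mask_last_position(item_ids):
    """Return a boolean mask that selects only the last item as the target."""
    if not item_ids:
        raise ValueError("sequence must contain at least one item")
    mask = [False] * len(item_ids)
    mask[-1] = True
    return mask


session = [12, 7, 33, 5]
mask = mask_last_position(session)

# Items the model may condition on, and the single item it has to predict.
visible = [item for item, is_target in zip(session, mask) if not is_target]
target = [item for item, is_target in zip(session, mask) if is_target]
print(visible, target)  # [12, 7, 33] [5]
```

Because exactly one position per sequence is a target, evaluation metrics computed this way correspond to next-item prediction, whereas random masking may select targets in the middle of a sequence.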

References

[1] Sun, Fei, et al. “BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer.” Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019.

Parameters

schema (Schema) – The input schema, which is used to discover the name of the item id column.
target (Union[str, Tags, ColumnSchema]) – The sequential input column from which the target is extracted.
masking_prob (float, optional) – Probability that an item is selected (masked) as a label of the given sequence. At least one item is always masked in each sequence so that every sequence is useful for training. Defaults to 0.2.
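The effect of masking_prob, including the guarantee that every sequence keeps at least one target, can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the function name `random_target_mask` is hypothetical, and the fallback of picking one position uniformly at random is an assumed way of enforcing the guarantee.

```python
import random


def random_target_mask(item_ids, masking_prob=0.2, rng=None):
    """Select target positions independently with probability `masking_prob`.

    If no position is selected, one is chosen uniformly at random so the
    sequence still provides a training target (assumed fallback strategy).
    """
    if not item_ids:
        raise ValueError("sequence must contain at least one item")
    if rng is None:
        rng = random.Random()
    mask = [rng.random() < masking_prob for _ in item_ids]
    if not any(mask):
        mask[rng.randrange(len(item_ids))] = True
    return mask


rng = random.Random(0)  # seeded only to make the sketch repeatable
batch = [[12, 7, 33, 5], [8, 41], [3]]
masks = [random_target_mask(seq, masking_prob=0.2, rng=rng) for seq in batch]
for seq, mask in zip(batch, masks):
    print(seq, mask)
```

With a small masking_prob and short sequences, the fallback branch is taken often, so the effective fraction of masked items can be noticeably higher than masking_prob.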

__init__(schema: merlin.schema.schema.Schema, target: Union[str, merlin.schema.tags.Tags, merlin.schema.schema.ColumnSchema], masking_prob: float = 0.2, **kwargs)

Methods

`__init__`(schema, target[, masking_prob])
`add_loss`(losses, **kwargs)	Add loss tensor(s), potentially dependent on layer inputs.
`add_metric`(value[, name])	Adds metric tensor to the layer.
`add_update`(updates)	Add update op(s), potentially dependent on layer inputs.
`add_variable`(*args, **kwargs)	Deprecated, do NOT use! Alias for add_weight.
`add_weight`([name, shape, dtype, ...])	Adds a new variable to the layer.
`apply_to_all`(inputs[, columns_to_filter])
`as_tabular`([name])
`build`(input_shapes)
`build_from_config`(config)
`calculate_batch_size_from_input_shapes`(...)
`call`(inputs[, targets, training, testing])
`call_outputs`(outputs[, training])
`check_schema`([schema])
`compute_call_output_shape`(input_shapes)
`compute_mask`(inputs[, mask])	Selects (masks) some positions of the targets to be predicted.
`compute_output_shape`(input_shape)
`compute_output_signature`(input_signature)	Compute the output tensor signature of the layer based on the inputs.
`configure_for_test`()	Method called by model.evaluate() to check that the masking_pre set in the TransformerBlock is aligned with the evaluation strategy of SequenceMaskRandom.
`configure_for_train`()	Method called by model.fit() to set the specialized masking_post and masking_pre needed by the TransformerBlock to align with the SequencePredictNext outputs.
`connect`(*block[, block_name, context])	Connect the block to other blocks sequentially.
`connect_branch`(*branches[, add_rest, post, ...])	Connect the block to one or multiple branches.
`connect_debug_block`([append])	Connect the block to a debug block.
`connect_with_residual`(block[, activation])	Connect the block to other blocks sequentially with a residual connection.
`connect_with_shortcut`(block[, ...])	Connect the block to other blocks sequentially with a shortcut connection.
`copy`()
`count_params`()	Count the total number of scalars composing the weights.
`finalize_state`()	Finalizes the layers state after updating layer weights.
`from_config`(config)
`from_features`(features[, pre, post, ...])	Initializes a TabularLayer instance where the contents of features will be filtered out
`from_layer`(layer)
`from_schema`(schema[, tags, allow_none])	Instantiate a TabularLayer instance from a DatasetSchema.
`get_build_config`()
`get_config`()
`get_input_at`(node_index)	Retrieves the input tensor(s) of a layer at a given node.
`get_input_mask_at`(node_index)	Retrieves the input mask tensor(s) of a layer at a given node.
`get_input_shape_at`(node_index)	Retrieves the input shape(s) of a layer at a given node.
`get_item_ids_from_inputs`(inputs)
`get_output_at`(node_index)	Retrieves the output tensor(s) of a layer at a given node.
`get_output_mask_at`(node_index)	Retrieves the output mask tensor(s) of a layer at a given node.
`get_output_shape_at`(node_index)	Retrieves the output shape(s) of a layer at a given node.
`get_padding_mask_from_item_id`(inputs[, ...])
`get_weights`()	Returns the current weights of the layer, as NumPy arrays.
`parse`(*block)
`parse_block`(input)
`post_call`(inputs[, transformations, ...])	Method that's typically called after the forward method for post-processing.
`pre_call`(inputs[, transformations])	Method that's typically called before the forward method for pre-processing.
`prepare`([block, post, aggregation])	Transform the inputs of this block.
`register_features`(feature_shapes)
`repeat`([num])	Repeat the block num times.
`repeat_in_parallel`([num, prefix, names, ...])	Repeat the block num times in parallel.
`repr_add`()
`repr_extra`()
`repr_ignore`()
`select_by_name`(name)
`select_by_tag`(tags)
`set_aggregation`(value)
`set_post`(value)
`set_pre`(value)
`set_schema`([schema])
`set_weights`(weights)	Sets the weights of the layer, from NumPy arrays.
`super`()
`with_name_scope`(method)	Decorator to automatically enter the module name scope.

Attributes

`REQUIRES_SCHEMA`
`activity_regularizer`	Optional regularizer function for the output of this layer.
`aggregation`	Return type: TabularAggregation, optional.
`compute_dtype`	The dtype of the layer's computations.
`context`
`dtype`	The dtype of the layer weights.
`dtype_policy`	The dtype policy associated with this layer.
`dynamic`	Whether the layer is dynamic (eager-only); set in the constructor.
`has_schema`
`inbound_nodes`	Return Functional API nodes upstream of this layer.
`input`	Retrieves the input tensor(s) of a layer.
`input_mask`	Retrieves the input mask tensor(s) of a layer.
`input_shape`	Retrieves the input shape(s) of a layer.
`input_spec`	InputSpec instance(s) describing the input format for this layer.
`is_input`
`is_tabular`
`losses`	List of losses added using the add_loss() API.
`metrics`	List of metrics added using the add_metric() API.
`name`	Name of the layer (string), set in the constructor.
`name_scope`	Returns a tf.name_scope instance for this class.
`non_trainable_variables`
`non_trainable_weights`	List of all non-trainable weights tracked by this layer.
`outbound_nodes`	Return Functional API nodes downstream of this layer.
`output`	Retrieves the output tensor(s) of a layer.
`output_mask`	Retrieves the output mask tensor(s) of a layer.
`output_shape`	Retrieves the output shape(s) of a layer.
`post`	Return type: SequentialTabularTransformations, optional.
`pre`	Return type: SequentialTabularTransformations, optional.
`registry`
`schema`
`stateful`
`submodules`	Sequence of all sub-modules.
`supports_masking`	Whether this layer supports computing a mask using compute_mask.
`trainable`
`trainable_variables`
`trainable_weights`	List of all trainable weights tracked by this layer.
`updates`
`variable_dtype`	Alias of Layer.dtype, the dtype of the weights.
`variables`	Returns the list of all layer variables/weights.
`weights`	Returns the list of all layer variables/weights.

compute_mask(inputs, mask=None): Selects (masks) some positions of the targets to be predicted. Keras calls this method after call(). It returns the target mask, which is assigned to the input tensors and to the targets and is accessible through tensor._keras_mask.
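A boolean target mask of this kind marks which positions count as prediction targets. The sketch below uses plain Python lists instead of tensors to show how such a mask could restrict a metric to the selected positions. The function `masked_accuracy` is hypothetical and is not part of Merlin Models or Keras; how the library itself consumes the mask is not shown here.

```python
def masked_accuracy(targets, predictions, target_mask):
    """Accuracy computed only over positions where `target_mask` is True."""
    scored = [
        (t, p) for t, p, is_target in zip(targets, predictions, target_mask) if is_target
    ]
    if not scored:
        raise ValueError("target_mask selects no positions")
    return sum(t == p for t, p in scored) / len(scored)


targets = [12, 7, 33, 5]       # same as the input item id sequence
predictions = [12, 9, 33, 1]   # hypothetical model outputs per position
target_mask = [False, True, True, False]

# Only positions 1 and 2 are scored: 7 vs 9 is wrong, 33 vs 33 is right.
print(masked_accuracy(targets, predictions, target_mask))  # 0.5
```

Positions outside the mask do not influence the result, even where the prediction happens to match the target.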

get_config()

classmethod from_config(config)

configure_for_train(): Method called by model.fit() to set the specialized masking_post and masking_pre needed by the TransformerBlock to align with the SequencePredictNext outputs.

configure_for_test(): Method called by model.evaluate() to check that the masking_pre set in the TransformerBlock is aligned with the evaluation strategy of SequenceMaskRandom.