merlin.models.tf.MaskedLanguageModeling

class merlin.models.tf.MaskedLanguageModeling(*args, **kwargs)

Bases: merlin.models.tf.blocks.core.masking.MaskingBlock

In Masked Language Modeling (MLM), some positions of the sequence are randomly selected and masked, and the model is trained to predict the items at those positions. During training, the Transformer layer is allowed to use positions to the right of a masked item (future information). During inference, all past items are visible to the Transformer layer, which tries to predict the next item.

:param {mask_sequence_parameters}:
:param mlm_probability: Probability that an item of the given sequence is selected (masked) as a label. At least one item is masked in each sequence, so that the network always has a label to learn from. Defaults to 0.15.
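The selection rule described above (each item chosen with probability mlm_probability, with at least one item masked per sequence) can be sketched in plain Python. This is a minimal illustration and not the library's TensorFlow implementation: the function name sample_mlm_mask and the rng argument are hypothetical, and excluding padded positions is an assumption drawn from the padding_idx argument of __init__.

```python
import random
from typing import List, Optional


def sample_mlm_mask(
    item_ids: List[int],
    mlm_probability: float = 0.15,
    padding_idx: int = 0,
    rng: Optional[random.Random] = None,
) -> List[bool]:
    """Return True at the positions selected (masked) as labels.

    Hypothetical helper for illustration only; not part of merlin.models.
    """
    if rng is None:
        rng = random.Random()

    # Assumption: padded positions are never eligible for masking.
    mask = [
        item != padding_idx and rng.random() < mlm_probability
        for item in item_ids
    ]

    # Enforce at least one masked item per sequence that has real items.
    valid_positions = [i for i, item in enumerate(item_ids) if item != padding_idx]
    if valid_positions and not any(mask):
        mask[rng.choice(valid_positions)] = True
    return mask


sequence = [12, 7, 31, 5, 0, 0]  # trailing zeros are padding
mask = sample_mlm_mask(sequence, rng=random.Random(42))
```

Whatever the random draw, the returned mask has at least one True among the four real items and never marks a padded position.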

__init__(padding_idx: int = 0, eval_on_last_item_seq_only: bool = True, mlm_probability: float = 0.15, **kwargs)

Methods

__init__([padding_idx, …])

add_features_to_context(feature_shapes)

add_loss(losses, **kwargs)

Add loss tensor(s), potentially dependent on layer inputs.

add_metric(value[, name])

Adds metric tensor to the layer.

add_update(updates)

Add update op(s), potentially dependent on layer inputs.

add_variable(*args, **kwargs)

Deprecated, do NOT use! Alias for add_weight.

add_weight([name, shape, dtype, …])

Adds a new variable to the layer.

apply_mask_to_inputs(inputs, schema)

as_tabular([name])

build(input_shapes)

call(inputs[, training])

call_outputs(outputs[, training])

check_schema([schema])

compute_mask(inputs[, mask])

Computes an output mask tensor.

compute_mask_schema(item_ids[, training])

Compute the mask schema for the masked language modeling task. The function is based on HuggingFace’s transformers/data/data_collator.py.

compute_output_shape(input_shape)

Computes the output shape of the layer.

compute_output_signature(input_signature)

Compute the output tensor signature of the layer based on the inputs.

connect(*block[, block_name, context])

Connect the block to other blocks sequentially.

connect_branch(*branches[, add_rest, post, …])

Connect the block to one or multiple branches.

connect_debug_block([append])

Connect the block to a debug block.

connect_with_residual(block[, activation])

Connect the block to other blocks sequentially with a residual connection.

connect_with_shortcut(block[, …])

Connect the block to other blocks sequentially with a shortcut connection.

copy()

count_params()

Count the total number of scalars composing the weights.

finalize_state()

Finalizes the layer’s state after updating layer weights.

from_config(config)

Creates a layer from its config.

from_layer(layer)

get_config()

get_input_at(node_index)

Retrieves the input tensor(s) of a layer at a given node.

get_input_mask_at(node_index)

Retrieves the input mask tensor(s) of a layer at a given node.

get_input_shape_at(node_index)

Retrieves the input shape(s) of a layer at a given node.

get_item_ids_from_inputs(inputs)

get_output_at(node_index)

Retrieves the output tensor(s) of a layer at a given node.

get_output_mask_at(node_index)

Retrieves the output mask tensor(s) of a layer at a given node.

get_output_shape_at(node_index)

Retrieves the output shape(s) of a layer at a given node.

get_padding_mask_from_item_id(inputs[, …])

get_weights()

Returns the current weights of the layer, as NumPy arrays.

parse(*block)

parse_block(input)

prepare([block, post, aggregation])

Transform the inputs of this block.

register_features(feature_shapes)

repeat([num])

Repeat the block num times.

repeat_in_parallel([num, prefix, names, …])

Repeat the block num times in parallel.

select_by_name(name)

set_schema([schema])

set_weights(weights)

Sets the weights of the layer, from NumPy arrays.

to_model(schema[, input_block, prediction_tasks])

Wrap the block between inputs & outputs to create a model.

with_name_scope(method)

Decorator to automatically enter the module name scope.

Attributes

REQUIRES_SCHEMA

activity_regularizer

Optional regularizer function for the output of this layer.

compute_dtype

The dtype of the layer’s computations.

context

dtype

The dtype of the layer weights.

dtype_policy

The dtype policy associated with this layer.

dynamic

Whether the layer is dynamic (eager-only); set in the constructor.

has_schema

inbound_nodes

Return Functional API nodes upstream of this layer.

input

Retrieves the input tensor(s) of a layer.

input_mask

Retrieves the input mask tensor(s) of a layer.

input_shape

Retrieves the input shape(s) of a layer.

input_spec

InputSpec instance(s) describing the input format for this layer.

losses

List of losses added using the add_loss() API.

metrics

List of metrics added using the add_metric() API.

name

Name of the layer (string), set in the constructor.

name_scope

Returns a tf.name_scope instance for this class.

non_trainable_variables

non_trainable_weights

List of all non-trainable weights tracked by this layer.

outbound_nodes

Return Functional API nodes downstream of this layer.

output

Retrieves the output tensor(s) of a layer.

output_mask

Retrieves the output mask tensor(s) of a layer.

output_shape

Retrieves the output shape(s) of a layer.

registry

schema

stateful

submodules

Sequence of all sub-modules.

supports_masking

Whether this layer supports computing a mask using compute_mask.

trainable

trainable_variables

trainable_weights

List of all trainable weights tracked by this layer.

updates

variable_dtype

Alias of Layer.dtype, the dtype of the weights.

variables

Returns the list of all layer variables/weights.

weights

Returns the list of all layer variables/weights.

get_config()

compute_mask_schema(item_ids: tensorflow.python.framework.ops.Tensor, training: bool = False) → tensorflow.python.framework.ops.Tensor

Compute the mask schema for the masked language modeling task. The function is based on HuggingFace’s transformers/data/data_collator.py.
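For the non-training case, the class description says the model predicts the next item from all past items. A minimal sketch of what such an evaluation mask could look like follows, assuming that eval_on_last_item_seq_only=True means only the last non-padded item of each sequence is used as the label. That reading is inferred from the parameter name, not confirmed by this page, and the helper last_item_mask is hypothetical.

```python
from typing import List


def last_item_mask(item_ids: List[int], padding_idx: int = 0) -> List[bool]:
    """Mark only the last non-padded position of the sequence as the label.

    Hypothetical helper for illustration only; not part of merlin.models.
    """
    mask = [False] * len(item_ids)
    # Walk backwards until the first real (non-padding) item is found.
    for position in range(len(item_ids) - 1, -1, -1):
        if item_ids[position] != padding_idx:
            mask[position] = True
            break
    return mask


print(last_item_mask([12, 7, 31, 5, 0, 0]))
# [False, False, False, True, False, False]
```

A sequence that contains only padding yields an all-False mask, since there is no item to predict.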