transformers4rec.torch.model package

Submodules

transformers4rec.torch.model.head module

transformers4rec.torch.model.model module

transformers4rec.torch.model.prediction_task module

class transformers4rec.torch.model.prediction_task.BinaryClassificationPrepareBlock[source]

Bases: transformers4rec.torch.block.base.BuildableBlock

Prepares the output layer of the binary classification prediction task. The output layer is a SequentialBlock of a torch linear layer followed by a sigmoid activation and a squeeze operation.

build(input_size)transformers4rec.torch.block.base.SequentialBlock[source]

Builds the output layer of binary classification based on the input_size.

Parameters

input_size (Tuple[int]) – The size of the input tensor. The last dimension sets the input dimension of the linear layer.

Returns

A SequentialBlock consisting of a linear layer (with input dimension equal to the last dimension of input_size), a sigmoid activation, and a squeeze operation.

Return type

SequentialBlock
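As a rough, framework-free illustration of what this output layer computes (a sketch, not the library's code), the function below applies a linear projection to a single unit, a sigmoid, and an implicit squeeze. The name binary_head and the weights are hypothetical.

```python
import math

def binary_head(batch, weights, bias):
    """Hypothetical stand-in for linear -> sigmoid -> squeeze.

    batch: list of feature rows, each of length input_size[-1].
    """
    outputs = []
    for row in batch:
        # Linear layer: project the last dimension down to one logit.
        logit = sum(x * w for x, w in zip(row, weights)) + bias
        # Sigmoid turns the logit into a probability in (0, 1).
        outputs.append(1.0 / (1.0 + math.exp(-logit)))
    # "Squeeze": one scalar per row, i.e. shape (batch_size,) not (batch_size, 1).
    return outputs

batch = [[0.0, 0.0], [1.0, -1.0]]
probs = binary_head(batch, weights=[1.0, -1.0], bias=0.0)
```

The result holds one probability per input row, which is the shape that a loss such as BCELoss expects alongside a 1-D target.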

class transformers4rec.torch.model.prediction_task.BinaryClassificationTask(target_name: Optional[str] = None, task_name: Optional[str] = None, task_block: Optional[Union[transformers4rec.torch.block.base.BlockBase, transformers4rec.torch.block.base.BuildableBlock]] = None, loss=BCELoss(), metrics=(BinaryPrecision(), BinaryRecall(), BinaryAccuracy()), summary_type='first')[source]

Bases: transformers4rec.torch.model.base.PredictionTask

Returns a PredictionTask for binary classification.

Example usage:

# Define the input module to process the tabular input features.
input_module = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=max_sequence_length,
    continuous_projection=d_model,
    aggregation="concat",
    masking=None,
)

# Define XLNetConfig class and set default parameters for HF XLNet config.
transformer_config = tr.XLNetConfig.build(
    d_model=d_model, n_head=4, n_layer=2, total_seq_length=max_sequence_length
)

# Define the model block including: inputs, masking, projection and transformer block.
body = tr.SequentialBlock(
    input_module,
    tr.MLPBlock([64]),
    tr.TransformerBlock(
        transformer_config,
        masking=input_module.masking
    )
)

# Define a head with BinaryClassificationTask.
head = tr.Head(
    body,
    tr.BinaryClassificationTask(
        "click",
        summary_type="mean",
        metrics=[
            tm.Precision(task='binary'),
            tm.Recall(task='binary'),
            tm.Accuracy(task='binary'),
            tm.F1Score(task='binary')
        ]
    ),
    inputs=input_module,
)

# Get the end-to-end Model class.
model = tr.Model(head)
Parameters
  • target_name (Optional[str] = None) – Specifies the variable name that represents the positive and negative values.

  • task_name (Optional[str] = None) – Specifies the name of the prediction task. If this parameter is not specified, a name is automatically constructed based on target_name and the Python class name of the model.

  • task_block (Optional[BlockType] = None) – Specifies a module to transform the input tensor before computing predictions.

  • loss (torch.nn.Module) – Specifies the loss function for the task. The default class is torch.nn.BCELoss.

  • metrics (Tuple[torch.nn.Module, ..]) – Specifies the metrics to calculate during training and evaluation. The default metrics are Precision, Recall, and Accuracy.

  • summary_type (str) –

    Summarizes a sequence into a single tensor. Accepted values are:

    • last – Take the last token hidden state (like XLNet)

    • first – Take the first token hidden state (like Bert)

    • mean – Take the mean of all tokens' hidden states

    • cls_index – Supply a Tensor of classification token position (GPT/GPT-2)

    • attn – Not implemented now, use multi-head attention
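The first three summary_type options can be pictured with a small plain-Python sketch. It assumes a single, unpadded sequence stored as a list of hidden-state vectors; the helper name summarize is hypothetical, and the library itself operates on batched tensors.

```python
def summarize(hidden_states, summary_type="first"):
    """Collapse a sequence of hidden-state vectors into one vector.

    hidden_states: list of equal-length lists, one per token (assumed layout).
    """
    if summary_type == "last":
        return list(hidden_states[-1])
    if summary_type == "first":
        return list(hidden_states[0])
    if summary_type == "mean":
        # Average each hidden dimension across all tokens.
        return [sum(dim) / len(dim) for dim in zip(*hidden_states)]
    raise ValueError(f"unsupported summary_type: {summary_type}")

sequence = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
first = summarize(sequence, "first")
last = summarize(sequence, "last")
mean = summarize(sequence, "mean")
```

Whichever option is chosen, the task head receives one vector per sequence rather than one per token.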

DEFAULT_LOSS = BCELoss()
DEFAULT_METRICS = (BinaryPrecision(), BinaryRecall(), BinaryAccuracy())
training: bool
class transformers4rec.torch.model.prediction_task.RegressionPrepareBlock[source]

Bases: transformers4rec.torch.block.base.BuildableBlock

Prepares the output layer of the regression prediction task. The output layer is a SequentialBlock of a torch linear layer followed by a squeeze operation.

build(input_size)transformers4rec.torch.block.base.SequentialBlock[source]

Builds the output layer of regression based on the input_size.

Parameters

input_size (Tuple[int]) – The size of the input tensor. The last dimension sets the input dimension of the linear layer.

Returns

A SequentialBlock consisting of a linear layer (with input dimension equal to the last dimension of input_size), and a squeeze operation.

Return type

SequentialBlock

class transformers4rec.torch.model.prediction_task.RegressionTask(target_name: Optional[str] = None, task_name: Optional[str] = None, task_block: Optional[Union[transformers4rec.torch.block.base.BlockBase, transformers4rec.torch.block.base.BuildableBlock]] = None, loss=MSELoss(), metrics=(MeanSquaredError()), summary_type='first')[source]

Bases: transformers4rec.torch.model.base.PredictionTask

Returns a PredictionTask for regression.

Example usage:

# Define the input module to process the tabular input features.
input_module = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=max_sequence_length,
    continuous_projection=d_model,
    aggregation="concat",
    masking=None,
)

# Define XLNetConfig class and set default parameters for HF XLNet config.
transformer_config = tr.XLNetConfig.build(
    d_model=d_model, n_head=4, n_layer=2, total_seq_length=max_sequence_length
)

# Define the model block including: inputs, projection and transformer block.
body = tr.SequentialBlock(
    input_module,
    tr.MLPBlock([64]),
    tr.TransformerBlock(
        transformer_config,
    )
)

# Define a head with RegressionTask.
head = tr.Head(
    body,
    tr.RegressionTask(
        "watch_time",
        summary_type="mean",
        metrics=[tm.regression.MeanSquaredError()]
    ),
    inputs=input_module,
)

# Get the end-to-end Model class.
model = tr.Model(head)
Parameters
  • target_name (Optional[str]) – Specifies the variable name that represents the continuous value to predict. By default None

  • task_name (Optional[str]) – Specifies the name of the prediction task. If this parameter is not specified, a name is automatically constructed based on target_name and the Python class name of the model. By default None

  • task_block (Optional[BlockType] = None) – Specifies a module to transform the input tensor before computing predictions.

  • loss (torch.nn.Module) – Specifies the loss function for the task. The default class is torch.nn.MSELoss.

  • metrics (Tuple[torch.nn.Module, ..]) – Specifies the metrics to calculate during training and evaluation. The default metric is MeanSquaredError.

  • summary_type (str) –

    Summarizes a sequence into a single tensor. Accepted values are:

    • last – Take the last token hidden state (like XLNet)

    • first – Take the first token hidden state (like Bert)

    • mean – Take the mean of all tokens' hidden states

    • cls_index – Supply a Tensor of classification token position (GPT/GPT-2)

    • attn – Not implemented now, use multi-head attention

DEFAULT_LOSS = MSELoss()
DEFAULT_METRICS = (MeanSquaredError(),)
training: bool
class transformers4rec.torch.model.prediction_task.NextItemPredictionTask(loss: torch.nn.modules.module.Module = CrossEntropyLoss(), metrics: Iterable[torchmetrics.metric.Metric] = (NDCGAt(), AvgPrecisionAt(), RecallAt()), task_block: Optional[Union[transformers4rec.torch.block.base.BlockBase, transformers4rec.torch.block.base.BuildableBlock]] = None, task_name: str = 'next-item', weight_tying: bool = False, softmax_temperature: float = 1, padding_idx: int = 0, target_dim: Optional[int] = None, sampled_softmax: Optional[bool] = False, max_n_samples: Optional[int] = 100)[source]

Bases: transformers4rec.torch.model.base.PredictionTask

This block performs the item prediction task for session-based and sequential models. It requires a body containing a masking scheme to use for training and target generation. For the supported masking schemes, please refer to: https://nvidia-merlin.github.io/Transformers4Rec/stable/model_definition.html#sequence-masking

Parameters
  • loss (torch.nn.Module) – Loss function to use. Defaults to CrossEntropyLoss, as shown in the signature.

  • metrics (Iterable[torchmetrics.Metric]) – List of ranking metrics to use for evaluation.

  • task_block – Module to transform input tensor before computing predictions.

  • task_name (str, optional) – Name of the prediction task. If not provided, a name is automatically constructed from the target name and the class name.

  • weight_tying (bool) – The item id embedding table weights are shared with the prediction network layer.

  • softmax_temperature (float) – Softmax temperature T, used to reduce model overconfidence: predictions are computed as softmax(logits / T). A value of 1.0 reduces to the regular softmax.

  • padding_idx (int) – Pad token id.

  • target_dim (int) – Vocabulary size of item ids.

  • sampled_softmax (Optional[bool]) – Enables sampled softmax. By default False

  • max_n_samples (Optional[int]) – Number of samples for sampled softmax. By default 100
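The effect of softmax_temperature can be illustrated with a minimal stdlib sketch of softmax(logits / T). The function name is hypothetical and this is not the library's implementation; it only shows that T > 1 flattens the distribution.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Compute softmax(logits / T) for a single list of logits."""
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
regular = softmax_with_temperature(logits, 1.0)   # plain softmax
softened = softmax_with_temperature(logits, 2.0)  # flatter, less confident
```

With T = 2.0 the top probability is lower than with T = 1.0, which is the sense in which a higher temperature reduces overconfidence.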

DEFAULT_METRICS = (NDCGAt(), AvgPrecisionAt(), RecallAt())
build(body, input_size, device=None, inputs=None, task_block=None, pre=None)[source]

Build method, this is called by the Head.

forward(inputs: torch.Tensor, targets=None, training=False, testing=False, top_k=None, **kwargs)[source]
remove_pad_3d(inp_tensor, non_pad_mask)[source]
calculate_metrics(predictions, targets)Dict[str, torch.Tensor][source]
compute_metrics()[source]
training: bool
class transformers4rec.torch.model.prediction_task.NextItemPredictionPrepareBlock(target_dim: int, weight_tying: bool = False, item_embedding_table: Optional[torch.nn.modules.module.Module] = None, softmax_temperature: float = 0, sampled_softmax: Optional[bool] = False, max_n_samples: Optional[int] = 100, min_id: Optional[int] = 0)[source]

Bases: transformers4rec.torch.block.base.BuildableBlock

Prepares the output layer of the next item prediction task. The output layer is an instance of the _NextItemPredictionTask class.

Parameters
  • target_dim (int) – The output dimension for next-item predictions.

  • weight_tying (bool, optional) – If true, ties the weights of the prediction layer and the item embedding layer. By default False.

  • item_embedding_table (torch.nn.Module, optional) – The module containing the item embedding table. By default None.

  • softmax_temperature (float, optional) – The temperature to be applied to the softmax function. Defaults to 0.

  • sampled_softmax (bool, optional) – If true, sampled softmax is used for approximating the full softmax function. By default False.

  • max_n_samples (int, optional) – The maximum number of samples when using sampled softmax. By default 100.

  • min_id (int, optional) – The minimum value of the range for the log-uniform sampling. By default 0.

build(input_size)transformers4rec.torch.block.base.Block[source]

Builds the output layer of next-item prediction based on the input_size.

Parameters

input_size (Tuple[int]) – The size of the input tensor. The last dimension sets the input dimension of the output layer.

Returns

an instance of _NextItemPredictionTask

Return type

Block[_NextItemPredictionTask]

class transformers4rec.torch.model.prediction_task.LogUniformSampler(max_n_samples: int, max_id: int, min_id: Optional[int] = 0, unique_sampling: bool = True, n_samples_multiplier_before_unique: int = 2)[source]

Bases: torch.nn.modules.module.Module

get_log_uniform_distr(max_id: int, min_id: int = 0)torch.Tensor[source]

Approximates the item frequency distribution with a log-uniform probability distribution, where P(class) = (log(class + 2) - log(class + 1)) / log(max_id + 1). It assumes item ids are sorted in decreasing order of frequency.

Parameters

max_id (int) – Maximum discrete value for sampling (e.g. cardinality of the item id)

Returns

Returns the log uniform probability distribution

Return type

torch.Tensor
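The formula can be checked with a short stdlib sketch. It assumes min_id = 0 and class ids running from 0 to max_id - 1, and it returns a plain list rather than a torch.Tensor; the function name is hypothetical.

```python
import math

def log_uniform_probs(max_id):
    """P(class) = (log(class + 2) - log(class + 1)) / log(max_id + 1)."""
    norm = math.log(max_id + 1)
    return [(math.log(c + 2) - math.log(c + 1)) / norm for c in range(max_id)]

probs = log_uniform_probs(1000)
total = sum(probs)  # the numerators telescope, so this is 1 up to rounding
```

Because the numerators telescope to log(max_id + 1), the probabilities sum to 1, and lower ids (assumed to be the more frequent items) receive more mass.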

get_unique_sampling_distr(dist, n_sample)[source]

Returns the probability that each item is sampled at least once in the specified number of trials. This is meant to be used when self.unique_sampling == True. That probability can be approximated by 1 - (1 - p)^n, and a numerically stable version is used: -expm1(num_tries * log1p(-p)), where num_tries is the number of trials n.
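The stable expression can be compared with the direct form using Python's math module. This is a sketch on a plain list of probabilities, each assumed to be strictly below 1; the function name is hypothetical.

```python
import math

def unique_sampling_probs(dist, n_samples):
    """Probability that each item appears at least once in n_samples draws."""
    # -expm1(n * log1p(-p)) equals 1 - (1 - p)**n but keeps precision for tiny p.
    return [-math.expm1(n_samples * math.log1p(-p)) for p in dist]

dist = [0.5, 0.25, 0.25]
stable = unique_sampling_probs(dist, 3)
naive = [1 - (1 - p) ** 3 for p in dist]
tiny = unique_sampling_probs([1e-20], 5)[0]  # the naive form would round to 0.0
```

For very small p, 1 - p rounds to exactly 1.0 in floating point, so the direct form returns 0 while the expm1/log1p form retains a value close to n * p.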

sample(labels: torch.Tensor)[source]

Sample negative samples and calculate their probabilities.

If unique_sampling==True, only unique sampled items are returned. In that case the actual number of samples varies from run to run: sampling without replacement (torch.multinomial(…, replacement=False)) is slow, so the sampler uses torch.multinomial(…, replacement=True).unique(), which doesn’t guarantee the same number of unique sampled items. Increasing n_samples_multiplier_before_unique raises the chance of obtaining more unique samples.

Parameters

labels (torch.Tensor, dtype=torch.long, shape=(batch_size,)) – The input labels for which negative samples should be generated.

Returns

  • neg_samples (torch.Tensor, dtype=torch.long, shape=(n_samples,)) – The unique negative samples drawn from the log-uniform distribution.

  • true_probs (torch.Tensor, dtype=torch.float32, shape=(batch_size,)) – The probabilities of the input labels according to the log-uniform distribution (depends on self.unique_sampling choice).

  • samp_log_probs (torch.Tensor, dtype=torch.float32, shape=(n_samples,)) – The probabilities of the sampled negatives according to the log-uniform distribution (depends on self.unique_sampling choice).
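The draw-with-replacement-then-deduplicate strategy described above can be sketched with the standard library. This stand-in uses random.choices instead of torch.multinomial, assumes min_id = 0, and truncates the unique ids to max_n_samples; all names here are hypothetical and the library's exact behavior may differ.

```python
import math
import random

def sample_negatives(max_id, max_n_samples, multiplier=2, seed=0):
    """Draw ids from a log-uniform distribution, then keep the unique ones."""
    rng = random.Random(seed)
    norm = math.log(max_id + 1)
    probs = [(math.log(c + 2) - math.log(c + 1)) / norm for c in range(max_id)]
    # Oversample with replacement (fast), as the multiplier parameter suggests.
    draws = rng.choices(range(max_id), weights=probs, k=max_n_samples * multiplier)
    # Deduplicate while preserving first-seen order, then cap the count.
    unique_ids = list(dict.fromkeys(draws))[:max_n_samples]
    return unique_ids, [probs[i] for i in unique_ids]

neg_ids, neg_probs = sample_negatives(max_id=1000, max_n_samples=100)
```

The number of returned ids depends on how many duplicates the oversampled draw contains, which is why the sample count is not fixed from run to run.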

forward(labels)[source]
training: bool

Module contents