transformers4rec.tf package
Subpackages
Submodules
transformers4rec.tf.masking module
-
class
transformers4rec.tf.masking.
MaskingInfo
(schema: tensorflow.python.framework.ops.Tensor, targets: tensorflow.python.framework.ops.Tensor)[source] Bases:
object
-
schema
: tensorflow.python.framework.ops.Tensor
-
targets
: tensorflow.python.framework.ops.Tensor
-
-
class
transformers4rec.tf.masking.
MaskSequence
(*args, **kwargs)[source] Bases:
keras.engine.base_layer.Layer
Base class to prepare masked items inputs/labels for language modeling tasks.
Transformer architectures can be trained in different ways. Depending on the training method, there is a specific masking schema. The masking schema sets the items to be predicted (labels) and masks (hides) their positions in the sequence so that they are not used by the Transformer layers for prediction.
- We currently provide 4 different masking schemes out of the box:
Causal LM (clm)
Masked LM (mlm)
Permutation LM (plm)
Replacement Token Detection (rtd)
This class can be extended to add a different masking scheme.
- Parameters
hidden_size (int) – The hidden dimension of input tensors, needed to initialize trainable vector of masked positions.
padding_idx (int, default = 0) – Index of padding item used for getting batch of sequences with the same length
eval_on_last_item_seq_only (bool, default = True) – Predict only last item during evaluation
-
compute_masked_targets
(item_ids: tensorflow.python.framework.ops.Tensor, training=False) → transformers4rec.tf.masking.MaskingInfo[source] Method to prepare masked labels based on the sequence of item ids. It returns the true labels of the masked positions and the related boolean mask. The class attributes mask_schema and masked_targets are also updated so that they can be reused in other modules.
- Parameters
item_ids (tf.Tensor) – The sequence of input item ids used for deriving labels of next item prediction task.
training (bool) – Flag to indicate whether we are in Training mode or not. During training, the labels can be any items within the sequence based on the selected masking task. During evaluation, we are predicting the last item in the sequence.
- Returns
- Return type
Tuple[MaskingSchema, MaskedTargets]
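The evaluation-time behaviour described above (predicting only the last item of each sequence) can be sketched in plain Python. This is only an illustration, not the library's implementation: `last_item_targets` is a hypothetical helper, sequences are assumed to be right-padded with `padding_idx`, and filling non-masked target positions with `padding_idx` is an assumption.

```python
from typing import List, Tuple


def last_item_targets(
    item_ids: List[List[int]], padding_idx: int = 0
) -> Tuple[List[List[bool]], List[List[int]]]:
    """Return (mask_schema, masked_targets) selecting the last non-padded item."""
    schema, targets = [], []
    for seq in item_ids:
        # Index of the last non-padding item, or -1 if the row is all padding.
        last = max((i for i, x in enumerate(seq) if x != padding_idx), default=-1)
        row_mask = [i == last for i in range(len(seq))]
        schema.append(row_mask)
        # Keep the true id at masked positions, padding everywhere else (assumption).
        targets.append([x if m else padding_idx for x, m in zip(seq, row_mask)])
    return schema, targets


schema, targets = last_item_targets([[5, 8, 3, 0], [7, 2, 0, 0]])
print(schema)   # boolean mask, True at the last real item of each row
print(targets)  # item id at the masked position, padding elsewhere
```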
-
apply_mask_to_inputs
(inputs: tensorflow.python.framework.ops.Tensor, schema: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source] Control the masked positions in the inputs by replacing the true interaction by a learnable masked embedding.
- Parameters
inputs (tf.Tensor) – The 3-D tensor of interaction embeddings resulting from the ops: TabularFeatures + aggregation + projection(optional)
schema (MaskingSchema) – The boolean mask indicating masked positions.
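A minimal sketch of the replacement step, using nested lists in place of a 3-D tensor. `apply_mask` and `masked_embedding` are hypothetical names; in the real layer the replacement vector is trainable, which plain lists cannot express.

```python
from typing import List


def apply_mask(
    inputs: List[List[List[float]]],
    schema: List[List[bool]],
    masked_embedding: List[float],
) -> List[List[List[float]]]:
    """Replace the embedding at every masked position with `masked_embedding`."""
    return [
        [list(masked_embedding) if is_masked else list(vec)
         for vec, is_masked in zip(seq, row)]
        for seq, row in zip(inputs, schema)
    ]


inputs = [[[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]]  # (batch=1, seq_len=3, hidden=2)
schema = [[False, False, True]]                    # hide the last position
masked = apply_mask(inputs, schema, masked_embedding=[0.5, -0.5])
print(masked)
```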
-
predict_all
(item_ids: tensorflow.python.framework.ops.Tensor) → transformers4rec.tf.masking.MaskingInfo[source] Prepare labels for all next item predictions instead of last-item predictions in a user’s sequence.
- Parameters
item_ids (tf.Tensor) – The sequence of input item ids used for deriving labels of next item prediction task.
- Returns
- Return type
Tuple[MaskingSchema, MaskedTargets]
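One plausible way to build labels for every next-item prediction is to shift each sequence left by one position. The sketch below assumes right-padded sequences and a hypothetical helper name, `next_item_labels`; the library's actual implementation may differ in detail.

```python
from typing import List, Tuple


def next_item_labels(
    item_ids: List[List[int]], padding_idx: int = 0
) -> Tuple[List[List[bool]], List[List[int]]]:
    """Label position t with the item at t + 1; the final slot gets padding."""
    labels = [seq[1:] + [padding_idx] for seq in item_ids]
    # Only positions that have a real next item contribute a target.
    schema = [[x != padding_idx for x in row] for row in labels]
    return schema, labels


all_schema, all_labels = next_item_labels([[5, 8, 3, 0]])
print(all_labels)
print(all_schema)
```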
-
call
(inputs: tensorflow.python.framework.ops.Tensor, item_ids: tensorflow.python.framework.ops.Tensor, training=False) → tensorflow.python.framework.ops.Tensor[source]
-
property
transformer_arguments
Prepare additional arguments to pass to the Transformer forward methods.
-
class
transformers4rec.tf.masking.
CausalLanguageModeling
(*args, **kwargs)[source] Bases:
transformers4rec.tf.masking.MaskSequence
In Causal Language Modeling (clm) you predict the next item based on past positions of the sequence. Future positions are masked.
- Parameters
hidden_size (int) – The hidden dimension of input tensors, needed to initialize trainable vector of masked positions.
padding_idx (int, default = 0) – Index of padding item used for getting batch of sequences with the same length
eval_on_last_item_seq_only (bool, default = True) – Predict only last item during evaluation
train_on_last_item_seq_only (bool) – Predict only the last item during training
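The idea that future positions are hidden is commonly expressed as a lower-triangular visibility matrix. The sketch below shows that general technique only; `causal_visibility` is a hypothetical helper and is not taken from the library.

```python
from typing import List


def causal_visibility(seq_len: int) -> List[List[bool]]:
    """Entry [i][j] is True when position i may use position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


for row in causal_visibility(3):
    print(row)
```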
-
class
transformers4rec.tf.masking.
MaskedLanguageModeling
(*args, **kwargs)[source] Bases:
transformers4rec.tf.masking.MaskSequence
In Masked Language Modeling (mlm) you randomly select some positions of the sequence to be predicted, which are masked. During training, the Transformer layer is allowed to use positions on the right (future info). During inference, all past items are visible for the Transformer layer, which tries to predict the next item.
- Parameters
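hidden_size (int) – The hidden dimension of input tensors, needed to initialize trainable vector of masked positions.
padding_idx (int, default = 0) – Index of padding item used for getting batch of sequences with the same length
eval_on_last_item_seq_only (bool, default = True) – Predict only last item during evaluation
mlm_probability (Optional[float], default = 0.15) – Probability of an item to be selected (masked) as a label of the given sequence. Note: at least one item is masked for each sequence, so that the network can learn something from it.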
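The random selection and the "at least one masked item" rule can be sketched as follows. This is an illustrative reading of the description above: `mlm_mask` and its `seed` argument are hypothetical, and only the constraint stated in the text is enforced.

```python
import random
from typing import List, Optional


def mlm_mask(
    item_ids: List[List[int]],
    mlm_probability: float = 0.15,
    padding_idx: int = 0,
    seed: Optional[int] = None,
) -> List[List[bool]]:
    """Pick positions to predict; padding is never selected."""
    rng = random.Random(seed)
    schema = []
    for seq in item_ids:
        valid = [i for i, x in enumerate(seq) if x != padding_idx]
        row = [False] * len(seq)
        for i in valid:
            row[i] = rng.random() < mlm_probability
        # Enforce at least one masked item per non-empty sequence.
        if valid and not any(row):
            row[rng.choice(valid)] = True
        schema.append(row)
    return schema


print(mlm_mask([[5, 8, 3, 0], [7, 2, 0, 0]], seed=0))
```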
transformers4rec.tf.ranking_metric module
-
class
transformers4rec.tf.ranking_metric.
RankingMetric
(*args, **kwargs)[source] Bases:
keras.metrics.Metric
Metric wrapper for computing ranking metrics@K for session-based tasks.
- Parameters
top_ks (list, default = [2, 5]) – List of cutoffs.
labels_onehot (bool) – Whether to transform the encoded labels to a one-hot representation.
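To make the role of the cutoffs concrete, here is a small recall@K sketch in plain Python. It is not the wrapper itself: `recall_at_k` is a hypothetical function, and labels are assumed to be integer-encoded item ids (the form that `labels_onehot` would convert to one-hot).

```python
from typing import Dict, List, Sequence


def recall_at_k(
    scores: List[List[float]], labels: List[int], top_ks: Sequence[int] = (2, 5)
) -> Dict[int, float]:
    """Fraction of rows whose true item appears among the k highest scores."""
    results = {}
    for k in top_ks:
        hits = 0
        for row, label in zip(scores, labels):
            top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
            hits += label in top
        results[k] = hits / len(labels)
    return results


scores = [[0.1, 0.7, 0.2, 0.0], [0.4, 0.3, 0.2, 0.1]]
labels = [2, 2]
print(recall_at_k(scores, labels, top_ks=(2, 3)))  # {2: 0.5, 3: 1.0}
```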
transformers4rec.tf.typing module
Module contents
-
class
transformers4rec.tf.
Schema
(feature: Sequence[merlin_standard_lib.proto.schema_bp.Feature] = <betterproto._PLACEHOLDER object>, sparse_feature: List[merlin_standard_lib.proto.schema_bp.SparseFeature] = <betterproto._PLACEHOLDER object>, weighted_feature: List[merlin_standard_lib.proto.schema_bp.WeightedFeature] = <betterproto._PLACEHOLDER object>, string_domain: List[merlin_standard_lib.proto.schema_bp.StringDomain] = <betterproto._PLACEHOLDER object>, float_domain: List[merlin_standard_lib.proto.schema_bp.FloatDomain] = <betterproto._PLACEHOLDER object>, int_domain: List[merlin_standard_lib.proto.schema_bp.IntDomain] = <betterproto._PLACEHOLDER object>, default_environment: List[str] = <betterproto._PLACEHOLDER object>, annotation: merlin_standard_lib.proto.schema_bp.Annotation = <betterproto._PLACEHOLDER object>, dataset_constraints: merlin_standard_lib.proto.schema_bp.DatasetConstraints = <betterproto._PLACEHOLDER object>, tensor_representation_group: Dict[str, merlin_standard_lib.proto.schema_bp.TensorRepresentationGroup] = <betterproto._PLACEHOLDER object>)[source] Bases:
merlin_standard_lib.proto.schema_bp._Schema
A collection of column schemas for a dataset.
-
feature
: List[merlin_standard_lib.schema.schema.ColumnSchema] = Field(name=None,type=None,default=<betterproto._PLACEHOLDER object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'betterproto': FieldMetadata(number=1, proto_type='message', map_types=None, group=None, wraps=None)}),_field_type=None)
-
classmethod
create
(column_schemas: Optional[Union[List[Union[merlin_standard_lib.schema.schema.ColumnSchema, str]], Dict[str, Union[merlin_standard_lib.schema.schema.ColumnSchema, str]]]] = None, **kwargs)[source]
-
apply
(selector) → merlin_standard_lib.schema.schema.Schema[source]
-
apply_inverse
(selector) → merlin_standard_lib.schema.schema.Schema[source]
-
select_by_type
(to_select) → merlin_standard_lib.schema.schema.Schema[source]
-
remove_by_type
(to_remove) → merlin_standard_lib.schema.schema.Schema[source]
-
select_by_tag
(to_select) → merlin_standard_lib.schema.schema.Schema[source]
-
remove_by_tag
(to_remove) → merlin_standard_lib.schema.schema.Schema[source]
-
select_by_name
(to_select) → merlin_standard_lib.schema.schema.Schema[source]
-
remove_by_name
(to_remove) → merlin_standard_lib.schema.schema.Schema[source]
-
map_column_schemas
(map_fn: Callable[[merlin_standard_lib.schema.schema.ColumnSchema], merlin_standard_lib.schema.schema.ColumnSchema]) → merlin_standard_lib.schema.schema.Schema[source]
-
filter_column_schemas
(filter_fn: Callable[[merlin_standard_lib.schema.schema.ColumnSchema], bool], negate=False) → merlin_standard_lib.schema.schema.Schema[source]
-
property
column_names
-
property
column_schemas
-
property
item_id_column_name
-
from_json
(value: Union[str, bytes]) → merlin_standard_lib.schema.schema.Schema[source]
-
from_proto_text
(path_or_proto_text: str) → merlin_standard_lib.schema.schema.Schema[source]
-
copy
(**kwargs) → merlin_standard_lib.schema.schema.Schema[source]
-
add
(other, allow_overlap=True) → merlin_standard_lib.schema.schema.Schema[source]
-
-
class
transformers4rec.tf.
Tag
(value)[source] Bases:
enum.Enum
An enumeration.
-
CATEGORICAL
= 'categorical'
-
CONTINUOUS
= 'continuous'
-
LIST
= 'list'
-
TEXT
= 'text'
-
TEXT_TOKENIZED
= 'text_tokenized'
-
TIME
= 'time'
-
USER
= 'user'
-
USER_ID
= 'user_id'
-
ITEM
= 'item'
-
ITEM_ID
= 'item_id'
-
SESSION
= 'session'
-
SESSION_ID
= 'session_id'
-
CONTEXT
= 'context'
-
TARGETS
= 'target'
-
BINARY_CLASSIFICATION
= 'binary_classification'
-
MULTI_CLASS_CLASSIFICATION
= 'multi_class'
-
REGRESSION
= 'regression'
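Because members are looked up by value as well as by name, it helps to remember that a few names and values differ (for example TARGETS maps to 'target'). The sketch below uses a stand-in enum that copies three of the members listed above, so it runs without the library installed.

```python
from enum import Enum


class Tag(Enum):
    # Stand-in mirroring a few members from the listing above.
    CATEGORICAL = "categorical"
    ITEM_ID = "item_id"
    TARGETS = "target"


# Lookup by value returns the same member as attribute access.
print(Tag("item_id") is Tag.ITEM_ID)  # True
# The member name is plural while the value is singular.
print(Tag.TARGETS.value)              # target
```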
-
-
class
transformers4rec.tf.
T4RecTrainingArgumentsTF
(output_dir: str, overwrite_output_dir: bool = False, do_train: bool = False, do_eval: bool = False, do_predict: bool = False, evaluation_strategy: transformers.trainer_utils.IntervalStrategy = 'no', prediction_loss_only: bool = False, per_device_train_batch_size: int = 8, per_device_eval_batch_size: int = 8, per_gpu_train_batch_size: Optional[int] = None, per_gpu_eval_batch_size: Optional[int] = None, gradient_accumulation_steps: int = 1, eval_accumulation_steps: Optional[int] = None, eval_delay: Optional[float] = 0, learning_rate: float = 5e-05, weight_decay: float = 0.0, adam_beta1: float = 0.9, adam_beta2: float = 0.999, adam_epsilon: float = 1e-08, max_grad_norm: float = 1.0, num_train_epochs: float = 3.0, max_steps: int = - 1, lr_scheduler_type: transformers.trainer_utils.SchedulerType = 'linear', warmup_ratio: float = 0.0, warmup_steps: int = 0, log_level: Optional[str] = 'passive', log_level_replica: Optional[str] = 'passive', log_on_each_node: bool = True, logging_dir: Optional[str] = None, logging_strategy: transformers.trainer_utils.IntervalStrategy = 'steps', logging_first_step: bool = False, logging_steps: int = 500, logging_nan_inf_filter: bool = True, save_strategy: transformers.trainer_utils.IntervalStrategy = 'steps', save_steps: int = 500, save_total_limit: Optional[int] = None, save_on_each_node: bool = False, no_cuda: bool = False, seed: int = 42, data_seed: Optional[int] = None, bf16: bool = False, fp16: bool = False, fp16_opt_level: str = 'O1', half_precision_backend: str = 'auto', bf16_full_eval: bool = False, fp16_full_eval: bool = False, tf32: Optional[bool] = None, local_rank: int = - 1, xpu_backend: Optional[str] = None, tpu_num_cores: Optional[int] = None, tpu_metrics_debug: bool = False, debug: str = '', dataloader_drop_last: bool = False, eval_steps: Optional[int] = None, dataloader_num_workers: int = 0, past_index: int = - 1, run_name: Optional[str] = None, disable_tqdm: Optional[bool] = None, remove_unused_columns: Optional[bool] = 
True, label_names: Optional[List[str]] = None, load_best_model_at_end: Optional[bool] = False, metric_for_best_model: Optional[str] = None, greater_is_better: Optional[bool] = None, ignore_data_skip: bool = False, sharded_ddp: str = '', deepspeed: Optional[str] = None, label_smoothing_factor: float = 0.0, optim: transformers.training_args.OptimizerNames = 'adamw_hf', adafactor: bool = False, group_by_length: bool = False, length_column_name: Optional[str] = 'length', report_to: Optional[List[str]] = None, ddp_find_unused_parameters: Optional[bool] = None, ddp_bucket_cap_mb: Optional[int] = None, dataloader_pin_memory: bool = True, skip_memory_metrics: bool = True, use_legacy_prediction_loop: bool = False, push_to_hub: bool = False, resume_from_checkpoint: Optional[str] = None, hub_model_id: Optional[str] = None, hub_strategy: transformers.trainer_utils.HubStrategy = 'every_save', hub_token: Optional[str] = None, gradient_checkpointing: bool = False, fp16_backend: str = 'auto', push_to_hub_model_id: Optional[str] = None, push_to_hub_organization: Optional[str] = None, push_to_hub_token: Optional[str] = None, mp_parameters: str = '', max_sequence_length: Optional[int] = None, shuffle_buffer_size: int = 0, data_loader_engine: str = 'nvtabular', eval_on_test_set: bool = False, eval_steps_on_train_set: int = 20, predict_top_k: int = 10, learning_rate_num_cosine_cycles_by_epoch: float = 1.25, log_predictions: bool = False, compute_metrics_each_n_steps: int = 1, experiments_group: str = 'default')[source] Bases:
transformers4rec.config.trainer.T4RecTrainingArguments
,transformers.training_args_tf.TFTrainingArguments
Prepare training arguments for TFTrainer. Inherits arguments from T4RecTrainingArguments and TFTrainingArguments.
-
class
transformers4rec.tf.
T4RecConfig
[source] Bases:
object
-
to_torch_model
(input_features, *prediction_task, task_blocks=None, task_weights=None, loss_reduction='mean', **kwargs)[source]
-
to_tf_model
(input_features, *prediction_task, task_blocks=None, task_weights=None, loss_reduction=<function reduce_mean>, **kwargs)[source]
-
property
transformers_config_cls
-
-
class
transformers4rec.tf.
GPT2Config
(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12, n_head=12, n_inner=None, activation_function='gelu_new', resid_pdrop=0.1, embd_pdrop=0.1, attn_pdrop=0.1, layer_norm_epsilon=1e-05, initializer_range=0.02, summary_type='cls_index', summary_use_proj=True, summary_activation=None, summary_proj_to_labels=True, summary_first_dropout=0.1, scale_attn_weights=True, use_cache=True, bos_token_id=50256, eos_token_id=50256, scale_attn_by_inverse_layer_idx=False, reorder_and_upcast_attn=False, **kwargs)[source] Bases:
transformers4rec.config.transformer.T4RecConfig
,transformers.models.gpt2.configuration_gpt2.GPT2Config
-
class
transformers4rec.tf.
XLNetConfig
(vocab_size=32000, d_model=1024, n_layer=24, n_head=16, d_inner=4096, ff_activation='gelu', untie_r=True, attn_type='bi', initializer_range=0.02, layer_norm_eps=1e-12, dropout=0.1, mem_len=512, reuse_len=None, use_mems_eval=True, use_mems_train=False, bi_data=False, clamp_len=- 1, same_length=False, summary_type='last', summary_use_proj=True, summary_activation='tanh', summary_last_dropout=0.1, start_n_top=5, end_n_top=5, pad_token_id=5, bos_token_id=1, eos_token_id=2, **kwargs)[source] Bases:
transformers4rec.config.transformer.T4RecConfig
,transformers.models.xlnet.configuration_xlnet.XLNetConfig
-
class
transformers4rec.tf.
LongformerConfig
(attention_window: Union[List[int], int] = 512, sep_token_id: int = 2, **kwargs)[source] Bases:
transformers4rec.config.transformer.T4RecConfig
,transformers.models.longformer.configuration_longformer.LongformerConfig
-
class
transformers4rec.tf.
AlbertConfig
(vocab_size=30000, embedding_size=128, hidden_size=4096, num_hidden_layers=12, num_hidden_groups=1, num_attention_heads=64, intermediate_size=16384, inner_group_num=1, hidden_act='gelu_new', hidden_dropout_prob=0, attention_probs_dropout_prob=0, max_position_embeddings=512, type_vocab_size=2, initializer_range=0.02, layer_norm_eps=1e-12, classifier_dropout_prob=0.1, position_embedding_type='absolute', pad_token_id=0, bos_token_id=2, eos_token_id=3, **kwargs)[source] Bases:
transformers4rec.config.transformer.T4RecConfig
,transformers.models.albert.configuration_albert.AlbertConfig
-
class
transformers4rec.tf.
ReformerConfig
(attention_head_size=64, attn_layers=['local', 'lsh', 'local', 'lsh', 'local', 'lsh'], axial_norm_std=1.0, axial_pos_embds=True, axial_pos_shape=[64, 64], axial_pos_embds_dim=[64, 192], chunk_size_lm_head=0, eos_token_id=2, feed_forward_size=512, hash_seed=None, hidden_act='relu', hidden_dropout_prob=0.05, hidden_size=256, initializer_range=0.02, is_decoder=False, layer_norm_eps=1e-12, local_num_chunks_before=1, local_num_chunks_after=0, local_attention_probs_dropout_prob=0.05, local_attn_chunk_length=64, lsh_attn_chunk_length=64, lsh_attention_probs_dropout_prob=0.0, lsh_num_chunks_before=1, lsh_num_chunks_after=0, max_position_embeddings=4096, num_attention_heads=12, num_buckets=None, num_hashes=1, pad_token_id=0, vocab_size=320, tie_word_embeddings=False, use_cache=True, classifier_dropout=None, **kwargs)[source] Bases:
transformers4rec.config.transformer.T4RecConfig
,transformers.models.reformer.configuration_reformer.ReformerConfig
-
class
transformers4rec.tf.
ElectraConfig
(vocab_size=30522, embedding_size=128, hidden_size=256, num_hidden_layers=12, num_attention_heads=4, intermediate_size=1024, hidden_act='gelu', hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, max_position_embeddings=512, type_vocab_size=2, initializer_range=0.02, layer_norm_eps=1e-12, summary_type='first', summary_use_proj=True, summary_activation='gelu', summary_last_dropout=0.1, pad_token_id=0, position_embedding_type='absolute', use_cache=True, classifier_dropout=None, **kwargs)[source] Bases:
transformers4rec.config.transformer.T4RecConfig
,transformers.models.electra.configuration_electra.ElectraConfig
-
class
transformers4rec.tf.
Block
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin
,keras.engine.base_layer.Layer
-
class
transformers4rec.tf.
SequentialBlock
(*args, **kwargs)[source] Bases:
transformers4rec.tf.block.base.Block
The SequentialLayer represents a sequence of Keras layers. It is a Keras Layer that can be used instead of tf.keras.layers.Sequential, which is actually a Keras Model. In contrast to keras Sequential, this layer can be used as a pure Layer in tf.functions and when exporting SavedModels, without having to pre-declare input and output shapes. In turn, this layer is usable as a preprocessing layer for TF Agents Networks, and can be exported via PolicySaver. Usage:
c = SequentialLayer([layer1, layer2, layer3])
output = c(inputs)  # Equivalent to: output = layer3(layer2(layer1(inputs)))
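The equivalence stated in the usage note can be checked with ordinary callables. The sketch below is plain function composition, not the Keras layer; `sequential` is a hypothetical helper.

```python
from functools import reduce
from typing import Any, Callable, Sequence


def sequential(layers: Sequence[Callable[[Any], Any]]) -> Callable[[Any], Any]:
    """Return a callable that feeds its input through `layers` in order."""
    def call(inputs: Any) -> Any:
        return reduce(lambda x, layer: layer(x), layers, inputs)
    return call


def layer1(x): return x + 1
def layer2(x): return x * 2
def layer3(x): return x - 3


c = sequential([layer1, layer2, layer3])
print(c(5), layer3(layer2(layer1(5))))  # both print 9
```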
-
property
inputs
-
property
trainable_weights
-
property
non_trainable_weights
-
property
trainable
-
property
losses
-
property
regularizers
-
property
-
class
transformers4rec.tf.
DLRMBlock
(*args, **kwargs)[source] Bases:
transformers4rec.tf.block.base.Block
-
classmethod
from_schema
(schema: merlin_standard_lib.schema.schema.Schema, bottom_mlp: Union[keras.engine.base_layer.Layer, transformers4rec.tf.block.base.Block], top_mlp: Optional[Union[keras.engine.base_layer.Layer, transformers4rec.tf.block.base.Block]] = None, **kwargs)[source]
-
classmethod
-
class
transformers4rec.tf.
TransformerBlock
(*args, **kwargs)[source] Bases:
transformers4rec.tf.block.base.Block
Class to support HF Transformers for session-based and sequential-based recommendation models.
- Parameters
transformer (TransformerBody) – The T4RecConfig, the pre-trained HF model, or the custom Keras layer TF*MainLayer for a specific transformer architecture.
masking – Needed when masking is applied on the inputs.
-
TRANSFORMER_TO_PREPARE
: Dict[Type[transformers.modeling_tf_utils.TFPreTrainedModel], Type[transformers4rec.tf.block.transformer.TransformerPrepare]] = {}
-
classmethod
from_registry
(transformer: str, d_model: int, n_head: int, n_layer: int, total_seq_length: int, masking: Optional[transformers4rec.tf.masking.MaskSequence] = None)[source] Load the HF transformer architecture based on its name
- Parameters
transformer (str) – Name of the Transformer to use. Possible values are: [“reformer”, “gpt2”, “longformer”, “electra”, “albert”, “xlnet”]
d_model (int) – Size of hidden states for Transformers
n_head (int) – Number of attention heads for Transformers
n_layer (int) – Number of layers for RNNs and Transformers
total_seq_length (int) – The maximum sequence length
-
class
transformers4rec.tf.
TabularBlock
(*args, **kwargs)[source] Bases:
transformers4rec.tf.block.base.Block
Layer that’s specialized for tabular-data by integrating many commonly used operations.
Note, when extending this class, typically you want to overwrite the compute_call_output_shape method instead of the normal compute_output_shape. This is because a Block can contain pre- and post-processing and the output-shapes are handled automatically in compute_output_shape. The output of compute_call_output_shape should be the shape that’s outputted by the call-method.
- Parameters
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after processing the call-method to output a single Tensor.
Next to providing a class that extends TabularAggregation, it’s also possible to provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box this contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
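To illustrate what it means to aggregate a dictionary of per-feature tensors into a single tensor, here is a sketch of two of the named aggregations on nested lists. The function `aggregate` is hypothetical, each feature is assumed to have shape (batch, dim), and the axis used for "concat" (the last one) is an assumption rather than something the text above states.

```python
from typing import Dict, List

Matrix = List[List[float]]  # (batch, dim)


def aggregate(features: Dict[str, Matrix], aggregation: str) -> Matrix:
    """Combine per-feature matrices into a single matrix."""
    tensors = list(features.values())
    batch = len(tensors[0])
    if aggregation == "concat":
        # Join feature vectors along the last axis: (batch, sum of dims).
        return [[v for t in tensors for v in t[b]] for b in range(batch)]
    if aggregation == "element-wise-sum":
        # Requires equal dims; adds matching entries: (batch, dim).
        return [[sum(vals) for vals in zip(*(t[b] for t in tensors))]
                for b in range(batch)]
    raise ValueError(f"unsupported aggregation in this sketch: {aggregation}")


features = {"a": [[1.0, 2.0]], "b": [[10.0, 20.0]]}
print(aggregate(features, "concat"))            # [[1.0, 2.0, 10.0, 20.0]]
print(aggregate(features, "element-wise-sum"))  # [[11.0, 22.0]]
```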
-
classmethod
from_schema
(schema: merlin_standard_lib.schema.schema.Schema, tags=None, **kwargs) → Optional[transformers4rec.tf.tabular.base.TabularBlock][source] Instantiate a TabularLayer instance from a DatasetSchema.
- Parameters
schema –
tags –
kwargs –
- Returns
- Return type
Optional[TabularModule]
-
classmethod
from_features
(features: List[str], pre: Optional[Union[str, transformers4rec.tf.tabular.base.TabularTransformation, List[Union[str, transformers4rec.tf.tabular.base.TabularTransformation]]]] = None, post: Optional[Union[str, transformers4rec.tf.tabular.base.TabularTransformation, List[Union[str, transformers4rec.tf.tabular.base.TabularTransformation]]]] = None, aggregation: Optional[Union[str, transformers4rec.tf.tabular.base.TabularAggregation]] = None, name=None, **kwargs) → transformers4rec.tf.tabular.base.TabularBlock[source] Initializes a TabularLayer instance that keeps only the features listed in features and filters out all others.
- Parameters
features (List[str]) – A list of feature-names that will be used as the first pre-processing op to filter out all other features not in this list.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after processing the call-method to output a single Tensor.
Next to providing a class that extends TabularAggregation, it’s also possible to provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box this contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
- Returns
- Return type
-
pre_call
(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], transformations: Optional[Union[str, transformers4rec.tf.tabular.base.TabularTransformation, List[Union[str, transformers4rec.tf.tabular.base.TabularTransformation]]]] = None) → Dict[str, tensorflow.python.framework.ops.Tensor][source] Method that’s typically called before the forward method for pre-processing.
- Parameters
inputs (TabularData) – input-data, typically the output of the forward method.
transformations (TabularTransformationsType, optional) –
- Returns
- Return type
TabularData
-
call
(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], **kwargs) → Dict[str, tensorflow.python.framework.ops.Tensor][source]
-
post_call
(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], transformations: Optional[Union[str, transformers4rec.tf.tabular.base.TabularTransformation, List[Union[str, transformers4rec.tf.tabular.base.TabularTransformation]]]] = None, merge_with: Optional[Union[transformers4rec.tf.tabular.base.TabularBlock, List[transformers4rec.tf.tabular.base.TabularBlock]]] = None, aggregation: Optional[Union[str, transformers4rec.tf.tabular.base.TabularAggregation]] = None) → Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]][source] Method that’s typically called after the forward method for post-processing.
- Parameters
inputs (TabularData) – input-data, typically the output of the forward method.
transformations (TabularTransformationType, optional) – Transformations to apply on the input data.
merge_with (Union[TabularModule, List[TabularModule]], optional) – Other TabularModule’s to call and merge the outputs with.
aggregation (TabularAggregationType, optional) – Aggregation to aggregate the output to a single Tensor.
- Returns
- Return type
TensorOrTabularData (Tensor when aggregation is set, else TabularData)
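The order of operations implied by pre_call, call and post_call, together with the return type noted above, can be sketched with plain dictionaries. All names here (`run_block`, `keep_known`, `double`, `concat`) are hypothetical stand-ins, not library functions.

```python
from typing import Callable, Dict, List, Optional, Union

TabularData = Dict[str, List[float]]
Tensor = List[float]


def run_block(
    inputs: TabularData,
    pre: Optional[Callable[[TabularData], TabularData]] = None,
    call: Callable[[TabularData], TabularData] = lambda x: x,
    post: Optional[Callable[[TabularData], TabularData]] = None,
    aggregation: Optional[Callable[[TabularData], Tensor]] = None,
) -> Union[Tensor, TabularData]:
    """Apply pre -> call -> post, then aggregate only if an aggregation is set."""
    x = pre(inputs) if pre else inputs
    x = call(x)
    x = post(x) if post else x
    return aggregation(x) if aggregation else x


def keep_known(data: TabularData) -> TabularData:
    # Pre-processing: drop every feature that is not in the allow-list.
    return {k: v for k, v in data.items() if k in {"price", "age"}}


def double(data: TabularData) -> TabularData:
    return {k: [2 * x for x in v] for k, v in data.items()}


def concat(data: TabularData) -> Tensor:
    return [x for k in sorted(data) for x in data[k]]


raw = {"price": [1.0], "age": [3.0], "noise": [9.0]}
print(run_block(raw, pre=keep_known, call=double))                      # dict
print(run_block(raw, pre=keep_known, call=double, aggregation=concat))  # list
```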
-
set_pre
(value: Optional[Union[str, transformers4rec.tf.tabular.base.TabularTransformation, List[Union[str, transformers4rec.tf.tabular.base.TabularTransformation]]]])[source]
-
property
pre
- Return type
SequentialTabularTransformations, optional
-
property
post
- Return type
SequentialTabularTransformations, optional
-
set_post
(value: Optional[Union[str, transformers4rec.tf.tabular.base.TabularTransformation, List[Union[str, transformers4rec.tf.tabular.base.TabularTransformation]]]])[source]
-
property
aggregation
- Return type
TabularAggregation, optional
-
set_aggregation
(value: Optional[Union[str, transformers4rec.tf.tabular.base.TabularAggregation]])[source]
- Parameters
value –
-
merge
(other, aggregation=None, **kwargs)
-
class
transformers4rec.tf.
ContinuousFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.tf.features.base.InputBlock
Input block for continuous features.
- Parameters
features (List[str]) – List of continuous features to include in this module.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after processing the call-method to output a single Tensor.
Next to providing a class that extends TabularAggregation, it’s also possible to provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box this contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
-
class
transformers4rec.tf.
EmbeddingFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.tf.features.base.InputBlock
Input block for embedding-lookups for categorical features.
For multi-hot features, the embeddings will be aggregated into a single tensor using the mean.
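The mean aggregation for a multi-hot feature amounts to averaging the looked-up rows of the embedding table. A minimal sketch with a toy table follows; `mean_combine` is a hypothetical helper, and the real lookup runs on tensors.

```python
from typing import List


def mean_combine(table: List[List[float]], ids: List[int]) -> List[float]:
    """Look up each id in `table` and average the resulting vectors."""
    vectors = [table[i] for i in ids]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


table = [[0.0, 0.0], [1.0, 3.0], [3.0, 5.0]]  # (vocabulary_size=3, dim=2)
print(mean_combine(table, [1, 2]))            # [2.0, 4.0]
```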
- Parameters
feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
item_id (str, optional) – The name of the feature that’s used for the item_id.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after processing the call-method to output a single Tensor.
Next to providing a class that extends TabularAggregation, it’s also possible to provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box this contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
-
classmethod
from_schema
(schema: merlin_standard_lib.schema.schema.Schema, embedding_dims: Optional[Dict[str, int]] = None, embedding_dim_default: Optional[int] = 64, infer_embedding_sizes: bool = False, infer_embedding_sizes_multiplier: float = 2.0, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, combiner: Optional[str] = 'mean', tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]]]] = None, item_id: Optional[str] = None, max_sequence_length: Optional[int] = None, **kwargs) → Optional[transformers4rec.tf.features.embedding.EmbeddingFeatures][source]
-
call
(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], **kwargs) → Dict[str, tensorflow.python.framework.ops.Tensor][source]
-
property
item_embedding_table
-
class
transformers4rec.tf.
SequenceEmbeddingFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.tf.features.embedding.EmbeddingFeatures
Input block for embedding-lookups for categorical features. This module produces 3-D tensors, which is useful for sequential models like transformers.
- Parameters
feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
item_id (str, optional) – The name of the feature that’s used for the item_id.
padding_idx (int) – The symbol to use for padding.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after processing the call-method to output a single Tensor.
Next to providing a class that extends TabularAggregation, it’s also possible to provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box this contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
-
class
transformers4rec.tf.
FeatureConfig
(table: tensorflow.python.tpu.tpu_embedding_v2_utils.TableConfig, max_sequence_length: int = 0, validate_weights_and_indices: bool = True, output_shape: Optional[Union[List[int], tensorflow.python.framework.tensor_shape.TensorShape]] = None, name: Optional[str] = None)[source] Bases:
object
Configuration data for one embedding feature.
This class holds the configuration data for a single embedding feature. The main use is to assign features to `tf.tpu.experimental.embedding.TableConfig`s via the table parameter:
```python
table_config_one = tf.tpu.experimental.embedding.TableConfig(
    vocabulary_size=..., dim=...)
table_config_two = tf.tpu.experimental.embedding.TableConfig(
    vocabulary_size=..., dim=...)
feature_config = {
    'feature_one': tf.tpu.experimental.embedding.FeatureConfig(
        table=table_config_one),
    'feature_two': tf.tpu.experimental.embedding.FeatureConfig(
        table=table_config_one),
    'feature_three': tf.tpu.experimental.embedding.FeatureConfig(
        table=table_config_two)}
embedding = tf.tpu.experimental.embedding.TPUEmbedding(
    feature_config=feature_config,
    batch_size=...,
    optimizer=tf.tpu.experimental.embedding.Adam(0.1))
```
The above configuration has two tables and three features. The first two features will be looked up in the first table and the third feature will be looked up in the second table.
You can also specify the output shape for each feature. The output shape should be the expected activation shape excluding the table dimension. For dense and sparse tensors, the output shape should be the same as the input shape excluding the last dimension. For ragged tensors, the output shape can mismatch the input shape.
NOTE: The max_sequence_length is only used when the input tensor has rank 2 and the output_shape is not set in the feature config.
When feeding features into embedding.enqueue they can be `tf.Tensor`s, `tf.SparseTensor`s or `tf.RaggedTensor`s. When the argument max_sequence_length is 0, the default, you should expect an output of embedding.dequeue for this feature of shape (batch_size, dim). If max_sequence_length is greater than 0, the feature is embedded as a sequence and padded up to the given length. The shape of the output for this feature will be (batch_size, max_sequence_length, dim).
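The shape rule described above can be restated in a few lines of plain Python. This is only a sketch: the helper name `expected_activation_shape` is hypothetical and not part of TensorFlow, and it assumes a rank-2 input with no explicit output_shape in the feature config.

```python
from typing import Tuple


def expected_activation_shape(
    batch_size: int, dim: int, max_sequence_length: int = 0
) -> Tuple[int, ...]:
    """Return the activation shape the text describes for one feature.

    Hypothetical helper: it only restates the documented rule and assumes
    that output_shape is not set in the feature config.
    """
    if max_sequence_length < 0:
        raise ValueError("max_sequence_length must be non-negative")
    if max_sequence_length == 0:
        # Default: one embedding vector per example.
        return (batch_size, dim)
    # Sequence feature: one embedding per position, padded to the given length.
    return (batch_size, max_sequence_length, dim)


print(expected_activation_shape(batch_size=32, dim=64))
print(expected_activation_shape(batch_size=32, dim=64, max_sequence_length=20))
```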
-
class
transformers4rec.tf.
TableConfig
(vocabulary_size: int, dim: int, initializer: Optional[Callable[[Any], None]] = None, optimizer: Optional[tensorflow.python.tpu.tpu_embedding_v2_utils._Optimizer] = None, combiner: str = 'mean', name: Optional[str] = None)[source] Bases:
object
Configuration data for one embedding table.
This class holds the configuration data for a single embedding table. It is used as the table parameter of a tf.tpu.experimental.embedding.FeatureConfig. Multiple tf.tpu.experimental.embedding.FeatureConfig objects can use the same tf.tpu.experimental.embedding.TableConfig object. In this case a shared table will be created for those feature lookups.
```python
import tensorflow as tf

# `...` marks values that the original example leaves unspecified.
table_config_one = tf.tpu.experimental.embedding.TableConfig(
    vocabulary_size=..., dim=...)
table_config_two = tf.tpu.experimental.embedding.TableConfig(
    vocabulary_size=..., dim=...)
feature_config = {
    'feature_one': tf.tpu.experimental.embedding.FeatureConfig(
        table=table_config_one),
    'feature_two': tf.tpu.experimental.embedding.FeatureConfig(
        table=table_config_one),
    'feature_three': tf.tpu.experimental.embedding.FeatureConfig(
        table=table_config_two)}
embedding = tf.tpu.experimental.embedding.TPUEmbedding(
    feature_config=feature_config,
    batch_size=...,
    optimizer=tf.tpu.experimental.embedding.Adam(0.1))
```
The above configuration has two tables and three features. The first two features will be looked up in the first table and the third feature will be looked up in the second table.
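To make the table sharing concrete, the sketch below groups features by the table they look up. It uses a small stand-in dataclass instead of the real TableConfig, so `TableStub`, `features_per_table`, and the vocabulary sizes and dimensions are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TableStub:
    """Hypothetical stand-in for TableConfig (not the TensorFlow class)."""

    name: str
    vocabulary_size: int
    dim: int


def features_per_table(feature_to_table: Dict[str, TableStub]) -> Dict[str, List[str]]:
    """Group feature names by the name of the table they are looked up in."""
    grouped: Dict[str, List[str]] = {}
    for feature, table in feature_to_table.items():
        grouped.setdefault(table.name, []).append(feature)
    return grouped


# Illustrative sizes only; the original example elides them.
table_one = TableStub("table_one", vocabulary_size=1000, dim=16)
table_two = TableStub("table_two", vocabulary_size=500, dim=8)

mapping = {
    "feature_one": table_one,
    "feature_two": table_one,  # shares table_one with feature_one
    "feature_three": table_two,
}

print(features_per_table(mapping))
```

Two features that point at the same table object end up in the same group, which mirrors the shared-table behaviour described for TableConfig.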
-
class
transformers4rec.tf.
TabularFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.tf.features.base.InputBlock, transformers4rec.tf.tabular.base.MergeTabular
Input block that combines different types of features: continuous, categorical & text.
- Parameters
continuous_layer (TabularBlock, optional) – Block used to process continuous features.
categorical_layer (TabularBlock, optional) – Block used to process categorical features.
text_embedding_layer (TabularBlock, optional) – Block used to process text features.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after processing the call-method to output a single Tensor.
Besides providing a class that extends TabularAggregation, it is also possible to provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box this contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
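The three simplest registered aggregations can be illustrated on nested lists. This is a rough sketch in plain Python rather than the library's Keras layers: the function names, the sorting of feature names, and the axis chosen for stacking are assumptions made for illustration.

```python
from typing import Dict, List

Matrix = List[List[float]]  # shape (batch_size, dim)


def concat(features: Dict[str, Matrix]) -> Matrix:
    """Concatenate every feature along the last axis."""
    names = sorted(features)
    batch_size = len(features[names[0]])
    return [
        [value for name in names for value in features[name][row]]
        for row in range(batch_size)
    ]


def stack(features: Dict[str, Matrix]) -> List[Matrix]:
    """Stack features on a new trailing axis: (batch_size, dim, n_features)."""
    names = sorted(features)
    batch_size = len(features[names[0]])
    dim = len(features[names[0]][0])
    return [
        [[features[name][row][col] for name in names] for col in range(dim)]
        for row in range(batch_size)
    ]


def element_wise_sum(features: Dict[str, Matrix]) -> Matrix:
    """Sum features position by position; all must share the same shape."""
    names = sorted(features)
    batch_size = len(features[names[0]])
    dim = len(features[names[0]][0])
    return [
        [sum(features[name][row][col] for name in names) for col in range(dim)]
        for row in range(batch_size)
    ]


inputs = {"a": [[1.0, 2.0]], "b": [[10.0, 20.0]]}
print(concat(inputs))
print(stack(inputs))
print(element_wise_sum(inputs))
```

Concatenation grows the last dimension, while stacking and summing both require every feature to have the same dimension.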
-
CONTINUOUS_MODULE_CLASS
alias of
transformers4rec.tf.features.continuous.ContinuousFeatures
-
EMBEDDING_MODULE_CLASS
alias of
transformers4rec.tf.features.embedding.EmbeddingFeatures
-
project_continuous_features
(mlp_layers_dims: Union[List[int], int]) → transformers4rec.tf.features.tabular.TabularFeatures[source] Combine all concatenated continuous features with stacked MLP layers
-
classmethod
from_schema
(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CATEGORICAL: 'categorical'>,), aggregation: Optional[str] = None, continuous_projection: Optional[Union[List[int], int]] = None, text_model=None, text_tags=<Tag.TEXT_TOKENIZED: 'text_tokenized'>, max_sequence_length=None, max_text_length=None, **kwargs)[source]
-
property
continuous_layer
-
property
categorical_layer
-
property
text_embedding_layer
-
class
transformers4rec.tf.
TabularSequenceFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.tf.features.tabular.TabularFeatures
Input module that combines different types of features into a sequence: continuous, categorical & text.
- Parameters
continuous_layer (TabularBlock, optional) – Block used to process continuous features.
categorical_layer (TabularBlock, optional) – Block used to process categorical features.
text_embedding_layer (TabularBlock, optional) – Block used to process text features.
projection_module (BlockOrModule, optional) – Module that’s used to project the output of this module, typically done by an MLPBlock.
masking (MaskSequence, optional) – Masking to apply to the inputs.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after processing the call-method to output a single Tensor.
Besides providing a class that extends TabularAggregation, it is also possible to provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box this contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
-
EMBEDDING_MODULE_CLASS
alias of
transformers4rec.tf.features.sequence.SequenceEmbeddingFeatures
-
classmethod
from_schema
(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags=(<Tag.CONTINUOUS: 'continuous'>, ), categorical_tags=(<Tag.CATEGORICAL: 'categorical'>, ), aggregation=None, max_sequence_length=None, continuous_projection=None, projection=None, d_output=None, masking=None, **kwargs) → transformers4rec.tf.features.sequence.TabularSequenceFeatures[source] Instantiates
TabularSequenceFeatures from a DatasetSchema.
- Parameters
schema (DatasetSchema) – Dataset schema
continuous_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the continuous features, by default Tag.CONTINUOUS
categorical_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the categorical features, by default Tag.CATEGORICAL
aggregation (Optional[str], optional) – Feature aggregation option, by default None
automatic_build (bool, optional) – Automatically infers input size from features, by default True
max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None
continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them by a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None
projection (Optional[torch.nn.Module, BuildableBlock], optional) – If set, project the aggregated embeddings vectors into hidden dimension vector space, by default None
d_output (Optional[int], optional) – If set, init a MLPBlock as projection module to project embeddings vectors, by default None
masking (Optional[Union[str, MaskSequence]], optional) – If set, apply masking to the input embeddings and compute masked labels. It requires a categorical_module that includes an item_id column, by default None
- Returns
TabularSequenceFeatures instantiated from a dataset schema.
- Return type
transformers4rec.tf.features.sequence.TabularSequenceFeatures
-
property
masking
-
property
item_id
-
property
item_embedding_table
-
class
transformers4rec.tf.
Head
(*args, **kwargs)[source] Bases:
keras.engine.base_layer.Layer
-
classmethod
from_schema
(schema: merlin_standard_lib.schema.schema.Schema, body: keras.engine.base_layer.Layer, task_blocks: Optional[Union[keras.engine.base_layer.Layer, Dict[str, keras.engine.base_layer.Layer]]] = None, task_weight_dict: Optional[Dict[str, float]] = None, loss_reduction=<function reduce_mean>, inputs: Optional[Union[transformers4rec.tf.features.sequence.TabularSequenceFeatures, transformers4rec.tf.features.tabular.TabularFeatures]] = None, **kwargs) → transformers4rec.tf.model.base.Head[source]
-
call
(body_outputs: tensorflow.python.framework.ops.Tensor, call_body=True, always_output_dict=False, **kwargs)[source]
-
compute_loss
(body_outputs, targets, training=False, call_body=True, compute_metrics=True, **kwargs) → tensorflow.python.framework.ops.Tensor[source]
-
property
task_blocks
-
property
metrics
-
classmethod
-
class
transformers4rec.tf.
AsDenseFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularTransformation], abc.ABC
-
class
transformers4rec.tf.
AsSparseFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularTransformation], abc.ABC
-
class
transformers4rec.tf.
ElementwiseSum
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularAggregation], abc.ABC
-
class
transformers4rec.tf.
ElementwiseSumItemMulti
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularAggregation], abc.ABC
-
call
(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], **kwargs) → tensorflow.python.framework.ops.Tensor[source]
-
REQUIRES_SCHEMA
= True
-
-
class
transformers4rec.tf.
AsTabular
(*args, **kwargs)[source] Bases:
keras.engine.base_layer.Layer
Converts a Tensor to TabularData by converting it to a dictionary.
- Parameters
-
class
transformers4rec.tf.
ConcatFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularAggregation], abc.ABC
-
class
transformers4rec.tf.
FilterFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularTransformation], abc.ABC
Transformation that filters out certain features from TabularData.
- Parameters
-
call
(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], **kwargs) → Dict[str, tensorflow.python.framework.ops.Tensor][source] Filter out features from inputs.
- Parameters
inputs (TabularData) – Input dictionary containing features to filter.
- Returns
Filtered TabularData that only contains the feature-names in self.to_include.
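The filtering step described above amounts to keeping only the listed keys of a dictionary. The following is a minimal sketch in plain Python, not the layer itself; the function name `filter_features` and the sample feature names are hypothetical, and plain lists stand in for tensors.

```python
from typing import Any, Dict, Iterable


def filter_features(inputs: Dict[str, Any], to_include: Iterable[str]) -> Dict[str, Any]:
    """Keep only the entries whose feature name is listed in to_include."""
    wanted = set(to_include)
    return {name: value for name, value in inputs.items() if name in wanted}


batch = {
    "item_id": [1, 2],
    "price": [0.5, 0.7],
    "category": [3, 4],
}

print(filter_features(batch, ["item_id", "price"]))
```

Names in to_include that are missing from the inputs are silently ignored in this sketch; whether the real layer does the same is not stated in the text.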
-
class
transformers4rec.tf.
MergeTabular
(*args, **kwargs)[source] Bases:
transformers4rec.tf.tabular.base.TabularBlock
Merge multiple TabularModules into a single output of TabularData.
- Parameters
blocks_to_merge (Union[TabularModule, Dict[str, TabularBlock]]) – TabularBlocks to merge. This can also be one or multiple dictionaries keyed by the name the module should have.
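Conceptually, merging runs every block on the same inputs and combines the resulting dictionaries into one. The sketch below shows that idea in plain Python under stated assumptions: `merge_tabular` and the two toy blocks are hypothetical, lists stand in for tensors, and raising on duplicate output names is a choice made here rather than documented behaviour.

```python
from typing import Callable, Dict, List

TabularData = Dict[str, List[float]]
Block = Callable[[TabularData], TabularData]


def merge_tabular(blocks: List[Block], inputs: TabularData) -> TabularData:
    """Run each block on the inputs and merge all outputs into one dictionary."""
    merged: TabularData = {}
    for block in blocks:
        outputs = block(inputs)
        overlap = merged.keys() & outputs.keys()
        if overlap:
            # Assumption of this sketch: duplicate feature names are an error.
            raise ValueError(f"duplicate output names: {sorted(overlap)}")
        merged.update(outputs)
    return merged


def continuous_block(data: TabularData) -> TabularData:
    """Toy block: rescale the 'price' feature."""
    return {"price": [value / 10.0 for value in data["price"]]}


def categorical_block(data: TabularData) -> TabularData:
    """Toy block: pass the 'item_id' feature through unchanged."""
    return {"item_id": list(data["item_id"])}


raw = {"price": [5.0, 20.0], "item_id": [1.0, 2.0]}
print(merge_tabular([continuous_block, categorical_block], raw))
```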
-
property
merge_values
-
property
to_merge_dict
-
class
transformers4rec.tf.
StackFeatures
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularAggregation], abc.ABC
-
class
transformers4rec.tf.
PredictionTask
(*args, **kwargs)[source] Bases:
keras.engine.base_layer.Layer, transformers4rec.tf.utils.tf_utils.LossMixin, transformers4rec.tf.utils.tf_utils.MetricsMixin
-
property
task_name
-
property
-
class
transformers4rec.tf.
BinaryClassificationTask
(*args, **kwargs)[source] Bases:
transformers4rec.tf.model.base.PredictionTask
-
DEFAULT_LOSS
= <keras.losses.BinaryCrossentropy object>
-
DEFAULT_METRICS
= (<class 'keras.metrics.Precision'>, <class 'keras.metrics.Recall'>, <class 'keras.metrics.BinaryAccuracy'>, <class 'keras.metrics.AUC'>)
-
-
class
transformers4rec.tf.
NextItemPredictionTask
(*args, **kwargs)[source] Bases:
transformers4rec.tf.model.base.PredictionTask
Next-item prediction task.
- Parameters
loss – Loss function. Defaults to SparseCategoricalCrossentropy().
metrics – List of RankingMetrics to be evaluated.
prediction_metrics – List of Keras metrics used to summarize the predictions.
label_metrics – List of Keras metrics used to summarize the labels.
loss_metrics – List of Keras metrics used to summarize the loss.
name – Optional task name.
target_dim (int) – Dimension of the target.
weight_tying (bool) – The item id embedding table weights are shared with the prediction network layer.
item_embedding_table (tf.Variable) – Variable of embedding table for the item.
softmax_temperature (float) – Softmax temperature T, used to reduce model overconfidence: the predictions are computed as softmax(logits / T). A value of 1.0 reduces to the regular softmax.
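The effect of the softmax temperature can be seen in a small standalone sketch. It is written in plain Python rather than TensorFlow, and the function name `softmax_with_temperature` is hypothetical; it only implements the formula softmax(logits / T) given above.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Compute softmax(logits / temperature) in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    # Subtracting the maximum does not change the result but avoids overflow.
    shift = max(scaled)
    exps = [math.exp(value - shift) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.0]
regular = softmax_with_temperature(logits, temperature=1.0)
smoothed = softmax_with_temperature(logits, temperature=2.0)

print(regular)
print(smoothed)
```

With a temperature above 1.0 the largest probability shrinks and the distribution becomes flatter, which is the sense in which the parameter reduces overconfidence.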
-
DEFAULT_LOSS
= <keras.losses.SparseCategoricalCrossentropy object>
-
DEFAULT_METRICS
= (NDCGAt(), AvgPrecisionAt(), RecallAt())
-
class
transformers4rec.tf.
RegressionTask
(*args, **kwargs)[source] Bases:
transformers4rec.tf.model.base.PredictionTask
-
DEFAULT_LOSS
= <keras.losses.MeanSquaredError object>
-
DEFAULT_METRICS
= (<class 'keras.metrics.RootMeanSquaredError'>,)
-
-
class
transformers4rec.tf.
Model
(*args, **kwargs)[source] Bases:
transformers4rec.tf.model.base.BaseModel
-
class
transformers4rec.tf.
StochasticSwapNoise
(*args, **kwargs)[source] Bases:
transformers4rec.config.schema.SchemaMixin, keras.engine.base_layer.Layer, merlin_standard_lib.registry.RegistryMixin[TabularTransformation], abc.ABC
Applies stochastic replacement of sequence features.
-
call
(inputs: Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]], input_mask: Optional[tensorflow.python.framework.ops.Tensor] = None, training=True, **kwargs) → Union[tensorflow.python.framework.ops.Tensor, Dict[str, tensorflow.python.framework.ops.Tensor]][source]
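One way to picture stochastic replacement is swap noise on a single padded sequence: each non-padded value is replaced, with some probability, by another non-padded value from the same sequence. The sketch below is an assumption-laden illustration in plain Python, not the layer's implementation; `swap_noise`, `replacement_prob`, and `pad_token` are hypothetical names, and sampling replacements from the same sequence is a simplification.

```python
import random
from typing import List, Optional


def swap_noise(
    sequence: List[int],
    replacement_prob: float,
    pad_token: int = 0,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Randomly replace non-padded values with other non-padded values."""
    rng = rng or random.Random()
    candidates = [value for value in sequence if value != pad_token]
    if not candidates:
        # Nothing to sample from: return an unchanged copy.
        return list(sequence)
    return [
        rng.choice(candidates)
        if value != pad_token and rng.random() < replacement_prob
        else value
        for value in sequence
    ]


session = [12, 7, 33, 0, 0]  # trailing zeros are padding
print(swap_noise(session, replacement_prob=0.3, rng=random.Random(0)))
```

Padded positions are never modified, so the sequence length and the padding pattern are preserved.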
-