transformers4rec.tf.features package

Submodules

transformers4rec.tf.features.base module

class transformers4rec.tf.features.base.InputBlock(*args, **kwargs)[source]

Bases: transformers4rec.tf.tabular.base.TabularBlock

transformers4rec.tf.features.continuous module

class transformers4rec.tf.features.continuous.ContinuousFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

Input block for continuous features.

Parameters
  • features (List[str]) – List of continuous features to include in this module.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called, before the call method runs.

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the outputs after the module is called, once the call method has run.

  • aggregation (Union[str, TabularAggregation], optional) –

    Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

    Besides providing a class that extends TabularAggregation, you can provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box, the registry contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.

  • schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.

  • name (Optional[str]) – Name of the layer.

classmethod from_features(features, **kwargs)[source]
call(inputs, *args, **kwargs)[source]
compute_call_output_shape(input_shapes)[source]
get_config()[source]
repr_ignore() → List[str][source]
repr_extra()[source]
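
The aggregation names listed above can be pictured with a small stand-alone sketch. It uses plain Python lists instead of tensors and is not the library's implementation: the function names, the sorted-name ordering, and the choice of a new leading axis for “stack” are assumptions made for illustration. “element-wise-sum-item-multi” is left out because its behaviour is not described on this page.

```python
from typing import Dict, List

Vector = List[float]


def _check_same_length(vectors: List[Vector]) -> None:
    """Stacking and summing only make sense for equally sized vectors."""
    if len({len(v) for v in vectors}) > 1:
        raise ValueError("all feature vectors must have the same length")


def concat(features: Dict[str, Vector]) -> Vector:
    """Join the feature vectors end to end (sorted by name, an assumption)."""
    out: Vector = []
    for name in sorted(features):
        out.extend(features[name])
    return out


def stack(features: Dict[str, Vector]) -> List[Vector]:
    """Pile equally sized vectors along a new leading axis (an assumption)."""
    vectors = [features[name] for name in sorted(features)]
    _check_same_length(vectors)
    return [list(v) for v in vectors]


def element_wise_sum(features: Dict[str, Vector]) -> Vector:
    """Add equally sized vectors position by position."""
    vectors = [features[name] for name in sorted(features)]
    _check_same_length(vectors)
    return [sum(values) for values in zip(*vectors)]


if __name__ == "__main__":
    example = {"price": [1.0, 2.0], "age": [3.0, 4.0]}
    print(concat(example))
    print(stack(example))
    print(element_wise_sum(example))
```

“concat” is the only one of the three that tolerates feature vectors of different widths; the other two require every feature to share the same size.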

transformers4rec.tf.features.embedding module

class transformers4rec.tf.features.embedding.EmbeddingFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

Input block for embedding-lookups for categorical features.

For multi-hot features, the embeddings will be aggregated into a single tensor using the mean.

Parameters
  • feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.

  • item_id (str, optional) – The name of the feature that’s used for the item_id.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called, before the call method runs.

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the outputs after the module is called, once the call method has run.

  • aggregation (Union[str, TabularAggregation], optional) –

    Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

    Besides providing a class that extends TabularAggregation, you can provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box, the registry contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.

  • schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.

  • name (Optional[str]) – Name of the layer.
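
As a rough illustration of the two ideas described for this class, mean aggregation of multi-hot features and sharing one table between features, here is a minimal pure-Python sketch. The table contents, the feature names, and the helper `lookup_mean` are hypothetical; the real block works on TensorFlow tensors and TableConfig objects.

```python
from typing import Dict, List

Vector = List[float]

# Hypothetical embedding table: row index -> embedding vector.
ITEM_TABLE: List[Vector] = [
    [0.0, 0.0],
    [1.0, 3.0],
    [3.0, 5.0],
    [5.0, 1.0],
]


def lookup_mean(table: List[Vector], ids: List[int]) -> Vector:
    """Look up every id of a multi-hot feature and average the rows."""
    dim = len(table[0])
    rows = [table[i] for i in ids]
    if not rows:
        # No ids present: fall back to a zero vector (an assumption).
        return [0.0] * dim
    return [sum(column) / len(rows) for column in zip(*rows)]


# Shared embeddings: two (hypothetical) features point at the same table,
# so both read from, and would train, the same rows.
feature_to_table: Dict[str, List[Vector]] = {
    "item_id": ITEM_TABLE,
    "last_viewed_item": ITEM_TABLE,
}

if __name__ == "__main__":
    print(lookup_mean(feature_to_table["item_id"], [1, 2]))
```

However many ids a multi-hot feature carries, the result is a single vector of the table's embedding dimension.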

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, embedding_dims: Optional[Dict[str, int]] = None, embedding_dim_default: Optional[int] = 64, infer_embedding_sizes: bool = False, infer_embedding_sizes_multiplier: float = 2.0, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, combiner: Optional[str] = 'mean', tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]]]] = None, item_id: Optional[str] = None, max_sequence_length: Optional[int] = None, **kwargs) → Optional[transformers4rec.tf.features.embedding.EmbeddingFeatures][source]
build(input_shapes)[source]
call(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], **kwargs) → Dict[str, tensorflow.python.framework.ops.Tensor][source]
compute_call_output_shape(input_shapes)[source]
property item_embedding_table
item_ids(inputs) → tensorflow.python.framework.ops.Tensor[source]
lookup_feature(name, val, output_sequence=False)[source]
get_config()[source]
classmethod from_config(config)[source]
transformers4rec.tf.features.embedding.serialize_table_config(table_config: tensorflow.python.tpu.tpu_embedding_v2_utils.TableConfig) → Dict[str, Any][source]
transformers4rec.tf.features.embedding.deserialize_table_config(table_params: Dict[str, Any]) → tensorflow.python.tpu.tpu_embedding_v2_utils.TableConfig[source]
transformers4rec.tf.features.embedding.serialize_feature_config(feature_config: tensorflow.python.tpu.tpu_embedding_v2_utils.FeatureConfig) → Dict[str, Any][source]
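
The signatures above turn a config object into a plain Dict[str, Any] and back. The sketch below shows that round trip with a stand-in dataclass; `TableConfigSketch` and the two `*_sketch` functions are hypothetical and only mimic the shape of the documented helpers, not their actual behaviour.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class TableConfigSketch:
    """Stand-in for a table config; not the TensorFlow TableConfig class."""

    vocabulary_size: int
    dim: int
    combiner: str = "mean"
    name: Optional[str] = None


def serialize_table_config_sketch(config: TableConfigSketch) -> Dict[str, Any]:
    """Flatten the config into a dictionary of plain Python values."""
    return asdict(config)


def deserialize_table_config_sketch(params: Dict[str, Any]) -> TableConfigSketch:
    """Rebuild the config object from the dictionary form."""
    return TableConfigSketch(**params)


if __name__ == "__main__":
    original = TableConfigSketch(vocabulary_size=1000, dim=64, name="item_id")
    # The dictionary form survives a trip through JSON unchanged.
    as_json = json.dumps(serialize_table_config_sketch(original))
    restored = deserialize_table_config_sketch(json.loads(as_json))
    print(restored == original)
```

A dictionary of plain values is the kind of structure that a get_config() / from_config() pair can store and reload.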

transformers4rec.tf.features.sequence module

class transformers4rec.tf.features.sequence.SequenceEmbeddingFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.embedding.EmbeddingFeatures

Input block for embedding-lookups for categorical features. This module produces 3-D tensors, which is useful for sequential models like transformers.

Parameters
  • feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.

  • item_id (str, optional) – The name of the feature that’s used for the item_id.

  • padding_idx (int) – The symbol to use for padding.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called, before the call method runs.

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the outputs after the module is called, once the call method has run.

  • aggregation (Union[str, TabularAggregation], optional) –

    Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

    Besides providing a class that extends TabularAggregation, you can provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box, the registry contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.

  • schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.

  • name (Optional[str]) – Name of the layer.

lookup_feature(name, val, **kwargs)[source]
compute_call_output_shape(input_shapes)[source]
compute_mask(inputs, mask=None)[source]
get_config()[source]
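
To make the 3-D output and the role of padding_idx concrete, the following sketch looks up one embedding per sequence position and derives a padding mask. It is a plain-Python illustration under assumed values (a padding index of 0, a three-row table); `lookup_sequence`, `padding_mask`, and `shape` are hypothetical helpers, not methods of this class.

```python
from typing import List

PADDING_IDX = 0  # assumed padding symbol for this sketch

# Hypothetical embedding table; row 0 is reserved for padding.
TABLE: List[List[float]] = [
    [0.0, 0.0],
    [1.0, 2.0],
    [3.0, 4.0],
]


def lookup_sequence(table, batch_ids):
    """One embedding per position: result is [batch, sequence, embedding_dim]."""
    return [[list(table[i]) for i in sequence] for sequence in batch_ids]


def padding_mask(batch_ids, padding_idx=PADDING_IDX):
    """True where a position holds a real item, False where it is padding."""
    return [[i != padding_idx for i in sequence] for sequence in batch_ids]


def shape(nested):
    """Shape of a non-empty, rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


if __name__ == "__main__":
    batch = [[1, 2, 0], [2, 0, 0]]  # two sessions padded to length 3
    print(shape(lookup_sequence(TABLE, batch)))
    print(padding_mask(batch))
```

Unlike the mean-combined lookup of the parent class, nothing is averaged here: the sequence axis is kept so that a downstream sequential model can attend over positions.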
class transformers4rec.tf.features.sequence.TabularSequenceFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.tabular.TabularFeatures

Input module that combines different types of features into a sequence: continuous, categorical & text.

Parameters
  • continuous_layer (TabularBlock, optional) – Block used to process continuous features.

  • categorical_layer (TabularBlock, optional) – Block used to process categorical features.

  • text_embedding_layer (TabularBlock, optional) – Block used to process text features.

  • projection_module (BlockOrModule, optional) – Module that’s used to project the output of this module, typically done by an MLPBlock.

  • masking (MaskSequence, optional) – Masking to apply to the inputs.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called, before the call method runs.

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the outputs after the module is called, once the call method has run.

  • aggregation (Union[str, TabularAggregation], optional) –

    Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

    Besides providing a class that extends TabularAggregation, you can provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box, the registry contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.

  • schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.

  • name (Optional[str]) – Name of the layer.

EMBEDDING_MODULE_CLASS

alias of transformers4rec.tf.features.sequence.SequenceEmbeddingFeatures

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags=(<Tag.CONTINUOUS: 'continuous'>, ), categorical_tags=(<Tag.CATEGORICAL: 'categorical'>, ), aggregation=None, max_sequence_length=None, continuous_projection=None, projection=None, d_output=None, masking=None, **kwargs) → transformers4rec.tf.features.sequence.TabularSequenceFeatures[source]

Instantiates TabularSequenceFeatures from a DatasetSchema.

Parameters
  • schema (DatasetSchema) – Dataset schema

  • continuous_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the continuous features, by default Tag.CONTINUOUS

  • categorical_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the categorical features, by default Tag.CATEGORICAL

  • aggregation (Optional[str], optional) – Feature aggregation option, by default None

  • automatic_build (bool, optional) – Automatically infers input size from features, by default True

  • max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None

  • continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them through a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None

  • projection (Optional[torch.nn.Module, BuildableBlock], optional) – If set, project the aggregated embedding vectors into the hidden-dimension vector space, by default None

  • d_output (Optional[int], optional) – If set, initialize an MLPBlock as the projection module to project the embedding vectors, by default None

  • masking (Optional[Union[str, MaskSequence]], optional) – If set, apply masking to the input embeddings and compute masked labels. This requires a categorical_module that includes an item_id column, by default None

Returns

Returns TabularSequenceFeatures from a dataset schema

Return type

TabularSequenceFeatures
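
The continuous_tags and categorical_tags arguments decide which schema columns go to which layer. The sketch below mimics that selection with a plain dictionary standing in for the schema; the column names, the tag strings, and `select_by_tags` are hypothetical, and the real Schema class has its own selection API that is not shown on this page.

```python
from typing import Dict, Iterable, List, Set

# Stand-in schema: column name -> set of tags attached to that column.
SCHEMA: Dict[str, Set[str]] = {
    "item_id": {"categorical", "item_id"},
    "category": {"categorical"},
    "price": {"continuous"},
    "dwell_time": {"continuous"},
}


def select_by_tags(schema: Dict[str, Set[str]], tags: Iterable[str]) -> List[str]:
    """Return the columns that carry at least one of the requested tags."""
    wanted = set(tags)
    return [name for name, column_tags in schema.items() if wanted & column_tags]


if __name__ == "__main__":
    # Mirrors the defaults in the signature: one tag per feature family.
    print(select_by_tags(SCHEMA, ("continuous",)))
    print(select_by_tags(SCHEMA, ("categorical",)))
```

Each selected group would then be handed to its own layer: the continuous columns to the continuous layer and the categorical columns to the embedding layer.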

project_continuous_features(dimensions)[source]
call(inputs, training=True)[source]
compute_call_output_shape(input_shape)[source]
compute_output_shape(input_shapes)[source]
property masking
set_masking(value)[source]
property item_id
property item_embedding_table
get_config()[source]
classmethod from_config(config, **kwargs)[source]

transformers4rec.tf.features.tabular module

class transformers4rec.tf.features.tabular.TabularFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock, transformers4rec.tf.tabular.base.MergeTabular

Input block that combines different types of features: continuous, categorical & text.

Parameters
  • continuous_layer (TabularBlock, optional) – Block used to process continuous features.

  • categorical_layer (TabularBlock, optional) – Block used to process categorical features.

  • text_embedding_layer (TabularBlock, optional) – Block used to process text features.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called, before the call method runs.

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the outputs after the module is called, once the call method has run.

  • aggregation (Union[str, TabularAggregation], optional) –

    Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

    Besides providing a class that extends TabularAggregation, you can provide the name under which the class is registered in the tabular_aggregation_registry. Out of the box, the registry contains: “concat”, “stack”, “element-wise-sum” & “element-wise-sum-item-multi”.

  • schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.

  • name (Optional[str]) – Name of the layer.

CONTINUOUS_MODULE_CLASS

alias of transformers4rec.tf.features.continuous.ContinuousFeatures

EMBEDDING_MODULE_CLASS

alias of transformers4rec.tf.features.embedding.EmbeddingFeatures

project_continuous_features(mlp_layers_dims: Union[List[int], int]) → transformers4rec.tf.features.tabular.TabularFeatures[source]

Combine all concatenated continuous features with stacked MLP layers

Parameters

mlp_layers_dims (Union[List[int], int]) – The MLP layer dimensions

Returns

Returns the same TabularFeatures object with the continuous features projected

Return type

TabularFeatures
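
What “concatenate, then project through stacked MLP layers” amounts to can be sketched in a few lines of plain Python. This is an illustration only: the weights are random and untrained, the ReLU activation and the absence of a bias term are assumptions, and treating a bare int as a single layer is a guess based on the Union[List[int], int] type. `normalize_dims`, `dense_relu`, and `project` are hypothetical names.

```python
import random
from typing import Dict, List, Union

Vector = List[float]


def normalize_dims(mlp_layers_dims: Union[List[int], int]) -> List[int]:
    """Accept either one width or a list of widths (assumed semantics)."""
    if isinstance(mlp_layers_dims, int):
        return [mlp_layers_dims]
    return list(mlp_layers_dims)


def dense_relu(x: Vector, weights: List[Vector]) -> Vector:
    """One bias-free dense layer with ReLU; weights has len(x) rows."""
    return [
        max(0.0, sum(xi * w for xi, w in zip(x, column)))
        for column in zip(*weights)
    ]


def project(
    continuous: Dict[str, Vector],
    mlp_layers_dims: Union[List[int], int],
    seed: int = 0,
) -> Vector:
    """Concatenate the continuous features, then apply the stacked layers."""
    rng = random.Random(seed)
    x = [value for name in sorted(continuous) for value in continuous[name]]
    for out_dim in normalize_dims(mlp_layers_dims):
        # Random, untrained weights: in_dim rows by out_dim columns.
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(out_dim)] for _ in x]
        x = dense_relu(x, weights)
    return x


if __name__ == "__main__":
    features = {"price": [0.5], "age": [0.2, 0.1]}
    print(len(project(features, [8, 4])))
```

The width of the result is set by the last entry of mlp_layers_dims, regardless of how many continuous columns were concatenated.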

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CATEGORICAL: 'categorical'>,), aggregation: Optional[str] = None, continuous_projection: Optional[Union[List[int], int]] = None, text_model=None, text_tags=<Tag.TEXT_TOKENIZED: 'text_tokenized'>, max_sequence_length=None, max_text_length=None, **kwargs)[source]
property continuous_layer
property categorical_layer
property text_embedding_layer
get_config()[source]
classmethod from_config(config)[source]

transformers4rec.tf.features.text module

class transformers4rec.tf.features.text.ParseTokenizedText(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

call(inputs, **kwargs)[source]
compute_output_shape(input_shapes)[source]
class transformers4rec.tf.features.text.TextEmbeddingFeaturesWithTransformers(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

call(inputs, **kwargs)[source]
compute_output_shape(input_shapes)[source]

Module contents