transformers4rec.tf.features package

Submodules

transformers4rec.tf.features.base module

class transformers4rec.tf.features.base.InputBlock(*args, **kwargs)[source]

Bases: transformers4rec.tf.tabular.base.TabularBlock

transformers4rec.tf.features.continuous module

class transformers4rec.tf.features.continuous.ContinuousFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

Input block for continuous features.

Parameters

features (List[str]) – List of continuous features to include in this module.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called (that is, before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs after the module is called (that is, after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

Besides passing a class that extends TabularAggregation, you can also pass the name under which the class is registered in the tabular_aggregation_registry. Out of the box, this registry contains: “concat”, “stack”, “element-wise-sum” and “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
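
The registered aggregation names can be easier to reason about with a small example. The following is a minimal sketch in plain Python, not the library's implementation: the function names, the sorted-by-name feature order, and the choice of the last axis for "stack" are all assumptions made for illustration. It shows what “concat”, “stack” and “element-wise-sum” conceptually do to a dictionary of per-feature outputs.

```python
from typing import Dict, List

Batch = List[List[float]]  # nested lists shaped [batch_size][feature_dim]


def concat(features: Dict[str, Batch]) -> Batch:
    """Join all features along the last axis (assumed order: sorted by name)."""
    names = sorted(features)
    batch_size = len(features[names[0]])
    return [
        [value for name in names for value in features[name][row]]
        for row in range(batch_size)
    ]


def stack(features: Dict[str, Batch]) -> List[Batch]:
    """Put features on a new last axis; every feature needs the same dim."""
    names = sorted(features)
    batch_size = len(features[names[0]])
    dim = len(features[names[0]][0])
    return [
        [[features[name][row][d] for name in names] for d in range(dim)]
        for row in range(batch_size)
    ]


def element_wise_sum(features: Dict[str, Batch]) -> Batch:
    """Add features position by position; every feature needs the same dim."""
    names = sorted(features)
    batch_size = len(features[names[0]])
    dim = len(features[names[0]][0])
    return [
        [sum(features[name][row][d] for name in names) for d in range(dim)]
        for row in range(batch_size)
    ]


# Two hypothetical features, batch size 2, dimension 2 each.
features = {
    "price": [[1.0, 2.0], [3.0, 4.0]],
    "age": [[10.0, 20.0], [30.0, 40.0]],
}
concatenated = concat(features)    # shape [2][4]
stacked = stack(features)          # shape [2][2][2]
summed = element_wise_sum(features)  # shape [2][2]
```

Note that “concat” accepts features of different widths, while “stack” and “element-wise-sum” only make sense when every feature has the same dimension.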

classmethod from_features(features, **kwargs)[source]

call(inputs, *args, **kwargs)[source]

compute_call_output_shape(input_shapes)[source]

get_config()[source]

repr_ignore() → List[str][source]

repr_extra()[source]

transformers4rec.tf.features.embedding module

class transformers4rec.tf.features.embedding.EmbeddingFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

Input block for embedding-lookups for categorical features.

For multi-hot features, the embeddings will be aggregated into a single tensor using the mean.
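
To make the mean aggregation for multi-hot features concrete, here is a minimal sketch in plain Python. The embedding table, the `mean_combine` helper, and the zero-vector fallback for an empty row are hypothetical and only illustrate the idea; the library performs the lookup on TensorFlow tensors.

```python
from typing import List

# Hypothetical embedding table: 4 rows, 2 dimensions per row.
EMBEDDING_TABLE: List[List[float]] = [
    [0.0, 0.0],
    [1.0, 3.0],
    [3.0, 5.0],
    [5.0, 1.0],
]


def mean_combine(ids: List[int], table: List[List[float]]) -> List[float]:
    """Look up each id and average the vectors dimension by dimension."""
    if not ids:
        # Assumption for this sketch: an empty row maps to a zero vector.
        return [0.0] * len(table[0])
    vectors = [table[i] for i in ids]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Each row holds a variable number of category ids (multi-hot).
multi_hot_batch = [[1, 2], [3], [1, 2, 3]]
combined = [mean_combine(ids, EMBEDDING_TABLE) for ids in multi_hot_batch]
```

Whatever the number of ids in a row, the result is one vector per row, which is what lets multi-hot and single-valued features be aggregated together.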

Parameters

feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
item_id (str, optional) – The name of the feature that’s used for the item_id.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called (that is, before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs after the module is called (that is, after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

Besides passing a class that extends TabularAggregation, you can also pass the name under which the class is registered in the tabular_aggregation_registry. Out of the box, this registry contains: “concat”, “stack”, “element-wise-sum” and “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, embedding_dims: Optional[Dict[str, int]] = None, embedding_dim_default: Optional[int] = 64, infer_embedding_sizes: bool = False, infer_embedding_sizes_multiplier: float = 2.0, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, combiner: Optional[str] = 'mean', tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]]]] = None, item_id: Optional[str] = None, max_sequence_length: Optional[int] = None, **kwargs) → Optional[transformers4rec.tf.features.embedding.EmbeddingFeatures][source]
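
The signature above exposes three ways to choose an embedding dimension: explicit `embedding_dims`, a shared `embedding_dim_default`, and `infer_embedding_sizes` with a multiplier. The sketch below shows one plausible way these could interact. It is an illustration under stated assumptions, not the library's code: the fourth-root heuristic, the precedence order (explicit, then inferred, then default), and both helper names are assumptions.

```python
import math
from typing import Dict, Optional


def infer_embedding_dim(cardinality: int, multiplier: float = 2.0) -> int:
    """Assumed heuristic: ceil(multiplier * cardinality ** 0.25)."""
    return int(math.ceil(math.pow(cardinality, 0.25) * multiplier))


def resolve_embedding_dims(
    cardinalities: Dict[str, int],
    embedding_dims: Optional[Dict[str, int]] = None,
    embedding_dim_default: int = 64,
    infer_embedding_sizes: bool = False,
    infer_embedding_sizes_multiplier: float = 2.0,
) -> Dict[str, int]:
    """Pick a dimension per feature (the precedence here is an assumption)."""
    explicit = embedding_dims or {}
    resolved = {}
    for name, cardinality in cardinalities.items():
        if name in explicit:
            # An explicitly requested size wins in this sketch.
            resolved[name] = explicit[name]
        elif infer_embedding_sizes:
            resolved[name] = infer_embedding_dim(
                cardinality, infer_embedding_sizes_multiplier
            )
        else:
            resolved[name] = embedding_dim_default
    return resolved


# Hypothetical feature cardinalities.
cardinalities = {"item_id": 1000, "category": 50, "city": 50}
inferred = resolve_embedding_dims(
    cardinalities, embedding_dims={"city": 8}, infer_embedding_sizes=True
)
defaults = resolve_embedding_dims(cardinalities)
```

With inference enabled, high-cardinality features get larger tables than low-cardinality ones, while the default path gives every feature the same width.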

build(input_shapes)[source]

call(inputs: Dict[str, tensorflow.python.framework.ops.Tensor], **kwargs) → Dict[str, tensorflow.python.framework.ops.Tensor][source]

compute_call_output_shape(input_shapes)[source]

property item_embedding_table

item_ids(inputs) → tensorflow.python.framework.ops.Tensor[source]

lookup_feature(name, val, output_sequence=False)[source]

get_config()[source]

classmethod from_config(config)[source]

transformers4rec.tf.features.embedding.serialize_table_config(table_config: tensorflow.python.tpu.tpu_embedding_v2_utils.TableConfig) → Dict[str, Any][source]

transformers4rec.tf.features.embedding.deserialize_table_config(table_params: Dict[str, Any]) → tensorflow.python.tpu.tpu_embedding_v2_utils.TableConfig[source]

transformers4rec.tf.features.embedding.serialize_feature_config(feature_config: tensorflow.python.tpu.tpu_embedding_v2_utils.FeatureConfig) → Dict[str, Any][source]

transformers4rec.tf.features.sequence module

class transformers4rec.tf.features.sequence.SequenceEmbeddingFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.embedding.EmbeddingFeatures

Input block for embedding-lookups for categorical features. This module produces 3-D tensors, which is useful for sequential models such as transformers.

Parameters

feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
item_id (str, optional) – The name of the feature that’s used for the item_id.
padding_idx (int) – The symbol to use for padding.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called (that is, before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs after the module is called (that is, after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

Besides passing a class that extends TabularAggregation, you can also pass the name under which the class is registered in the tabular_aggregation_registry. Out of the box, this registry contains: “concat”, “stack”, “element-wise-sum” and “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.
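
The 3-D output and the role of `padding_idx` can be sketched without TensorFlow. The example below is a plain-Python illustration under assumptions: the table values and both helper functions are hypothetical, and treating a position as padding when its id equals `padding_idx` is an assumed reading of how a mask would be derived.

```python
from typing import List

PADDING_IDX = 0  # assumed padding symbol for this sketch

# Hypothetical embedding table: row 0 is reserved for padding.
TABLE: List[List[float]] = [
    [0.0, 0.0],
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]


def lookup_sequences(
    ids: List[List[int]], table: List[List[float]]
) -> List[List[List[float]]]:
    """Embed every position, giving [batch][sequence][embedding_dim]."""
    return [[list(table[i]) for i in session] for session in ids]


def padding_mask(
    ids: List[List[int]], padding_idx: int = PADDING_IDX
) -> List[List[bool]]:
    """True where a position holds a real item, False where it is padding."""
    return [[i != padding_idx for i in session] for session in ids]


# Two sessions padded to length 3; the second has one real item.
item_ids = [[1, 2, 3], [2, 0, 0]]
embedded = lookup_sequences(item_ids, TABLE)
mask = padding_mask(item_ids)
shape = (len(embedded), len(embedded[0]), len(embedded[0][0]))
```

Keeping the sequence axis, instead of averaging over it, is what allows a downstream transformer to attend over individual positions.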

lookup_feature(name, val, **kwargs)[source]

compute_call_output_shape(input_shapes)[source]

compute_mask(inputs, mask=None)[source]

get_config()[source]

class transformers4rec.tf.features.sequence.TabularSequenceFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.tabular.TabularFeatures

Input module that combines different types of features to a sequence: continuous, categorical & text.

Parameters

continuous_layer (TabularBlock, optional) – Block used to process continuous features.
categorical_layer (TabularBlock, optional) – Block used to process categorical features.
text_embedding_layer (TabularBlock, optional) – Block used to process text features.
projection_module (BlockOrModule, optional) – Module that’s used to project the output of this module, typically done by an MLPBlock.
masking (MaskSequence, optional) – Masking to apply to the inputs.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called (that is, before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs after the module is called (that is, after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

Besides passing a class that extends TabularAggregation, you can also pass the name under which the class is registered in the tabular_aggregation_registry. Out of the box, this registry contains: “concat”, “stack”, “element-wise-sum” and “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.

EMBEDDING_MODULE_CLASS: alias of transformers4rec.tf.features.sequence.SequenceEmbeddingFeatures

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags=(<Tag.CONTINUOUS: 'continuous'>, ), categorical_tags=(<Tag.CATEGORICAL: 'categorical'>, ), aggregation=None, max_sequence_length=None, continuous_projection=None, projection=None, d_output=None, masking=None, **kwargs) → transformers4rec.tf.features.sequence.TabularSequenceFeatures [source]

Instantiates TabularFeatures from a DatasetSchema

Parameters

schema (DatasetSchema) – Dataset schema
continuous_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the continuous features, by default Tag.CONTINUOUS
categorical_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the categorical features, by default Tag.CATEGORICAL
aggregation (Optional[str], optional) – Feature aggregation option, by default None
automatic_build (bool, optional) – Automatically infers input size from features, by default True
max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None
continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them through a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None
projection (Optional[torch.nn.Module, BuildableBlock], optional) – If set, project the aggregated embedding vectors into the hidden-dimension vector space, by default None
d_output (Optional[int], optional) – If set, initialize an MLPBlock as the projection module to project the embedding vectors, by default None
masking (Optional[Union[str, MaskSequence]], optional) – If set, apply masking to the input embeddings and compute masked labels. This requires a categorical_module that includes an item_id column, by default None

Returns

Returns TabularFeatures from a dataset schema

Return type

TabularFeatures

project_continuous_features(dimensions)[source]

call(inputs, training=True)[source]

compute_call_output_shape(input_shape)[source]

compute_output_shape(input_shapes)[source]

property masking

set_masking(value)[source]

property item_id

property item_embedding_table

get_config()[source]

classmethod from_config(config, **kwargs)[source]

transformers4rec.tf.features.tabular module

class transformers4rec.tf.features.tabular.TabularFeatures(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock, transformers4rec.tf.tabular.base.MergeTabular

Input block that combines different types of features: continuous, categorical & text.

Parameters

continuous_layer (TabularBlock, optional) – Block used to process continuous features.
categorical_layer (TabularBlock, optional) – Block used to process categorical features.
text_embedding_layer (TabularBlock, optional) – Block used to process text features.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs when the module is called (that is, before call).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply to the inputs after the module is called (that is, after call).
aggregation (Union[str, TabularAggregation], optional) –
Aggregation to apply after the call method has run, so that the block outputs a single Tensor.

Besides passing a class that extends TabularAggregation, you can also pass the name under which the class is registered in the tabular_aggregation_registry. Out of the box, this registry contains: “concat”, “stack”, “element-wise-sum” and “element-wise-sum-item-multi”.
schema (Optional[DatasetSchema]) – DatasetSchema containing the columns used in this block.
name (Optional[str]) – Name of the layer.

CONTINUOUS_MODULE_CLASS: alias of transformers4rec.tf.features.continuous.ContinuousFeatures

EMBEDDING_MODULE_CLASS: alias of transformers4rec.tf.features.embedding.EmbeddingFeatures

project_continuous_features(mlp_layers_dims: Union[List[int], int]) → transformers4rec.tf.features.tabular.TabularFeatures [source]

Combine all concatenated continuous features with stacked MLP layers

Parameters: mlp_layers_dims (Union[List[int], int]) – The MLP layer dimensions
Returns: Returns the same TabularFeatures object with the continuous features projected
Return type: TabularFeatures
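
As a rough illustration of what projecting continuous features through stacked MLP layers means, the sketch below concatenates scalar features and passes them through two dense layers in plain Python. Everything here is hypothetical: the helper names, the fixed weights, the ReLU activation, and the sorted-by-name concatenation order are assumptions, and the layer widths 3 and 2 stand in for `mlp_layers_dims=[3, 2]`.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Layer = Tuple[List[Vector], Vector]  # (weights [out_dim][in_dim], bias [out_dim])


def dense_relu(inputs: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """One fully connected layer followed by ReLU (activation is assumed)."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


def project_continuous(features: Dict[str, float], layers: List[Layer]) -> Vector:
    """Concatenate scalar features (sorted by name), then apply each layer."""
    hidden = [features[name] for name in sorted(features)]
    for weights, bias in layers:
        hidden = dense_relu(hidden, weights, bias)
    return hidden


# Fixed weights so the result is easy to check by hand.
layers: List[Layer] = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, -1.0]),  # 2 -> 3
    ([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]], [0.0, 0.0]),          # 3 -> 2
]
projected = project_continuous({"price": 2.0, "age": 1.0}, layers)
```

The output width equals the last entry of the layer dimensions, so the projected continuous block can be sized to match the categorical embeddings before aggregation.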

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CATEGORICAL: 'categorical'>,), aggregation: Optional[str] = None, continuous_projection: Optional[Union[List[int], int]] = None, text_model=None, text_tags=<Tag.TEXT_TOKENIZED: 'text_tokenized'>, max_sequence_length=None, max_text_length=None, **kwargs)[source]

property continuous_layer

property categorical_layer

property text_embedding_layer

get_config()[source]

classmethod from_config(config)[source]

transformers4rec.tf.features.text module

class transformers4rec.tf.features.text.ParseTokenizedText(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

call(inputs, **kwargs)[source]

compute_output_shape(input_shapes)[source]

class transformers4rec.tf.features.text.TextEmbeddingFeaturesWithTransformers(*args, **kwargs)[source]

Bases: transformers4rec.tf.features.base.InputBlock

call(inputs, **kwargs)[source]

compute_output_shape(input_shapes)[source]
