transformers4rec.torch.features package

Submodules

transformers4rec.torch.features.base module

class transformers4rec.torch.features.base.InputBlock(pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]: Bases: transformers4rec.torch.tabular.base.TabularBlock, abc.ABC

transformers4rec.torch.features.continuous module

class transformers4rec.torch.features.continuous.ContinuousFeatures(features: List[str], pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]

Bases: transformers4rec.torch.features.base.InputBlock

Input block for continuous features.

Parameters

features (List[str]) – List of continuous features to include in this module.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).
aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

classmethod from_features(features, **kwargs)[source]

forward(inputs, **kwargs)[source]

forward_output_size(input_sizes)[source]

transformers4rec.torch.features.embedding module

class transformers4rec.torch.features.embedding.EmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], item_id: Optional[str] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]

Bases: transformers4rec.torch.features.base.InputBlock

Input block for embedding-lookups for categorical features.

For multi-hot features, the embeddings will be aggregated into a single tensor using the mean.

Parameters

feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
item_id (str, optional) – The name of the feature that’s used for the item_id.

pre: Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional: Transformations to apply on the inputs when the module is called (so before forward).
post: Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional: Transformations to apply on the inputs after the module is called (so after forward).
aggregation: Union[str, TabularAggregation], optional: Aggregation to apply after processing the forward-method to output a single Tensor.

property item_embedding_table

table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → torch.nn.modules.module.Module [source]

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, embedding_dims: Optional[Dict[str, int]] = None, embedding_dim_default: int = 64, infer_embedding_sizes: bool = False, infer_embedding_sizes_multiplier: float = 2.0, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, combiner: str = 'mean', tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]]]] = None, item_id: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, aggregation=None, pre=None, post=None, **kwargs) → Optional[transformers4rec.torch.features.embedding.EmbeddingFeatures][source]

Instantitates EmbeddingFeatures from a DatasetSchema.

Parameters

schema (DatasetSchema) – Dataset schema
embedding_dims (Optional[Dict[str, int]], optional) – The dimension of the embedding table for each feature (key), by default None by default None
default_embedding_dim (Optional[int], optional) – Default dimension of the embedding table, when the feature is not found in default_soft_embedding_dim, by default 64
infer_embedding_sizes (bool, optional) – Automatically defines the embedding dimension from the feature cardinality in the schema, by default False
infer_embedding_sizes_multiplier (Optional[int], by default 2.0) – multiplier used by the heuristic to infer the embedding dimension from its cardinality. Generally reasonable values range between 2.0 and 10.0
embeddings_initializers (Optional[Dict[str, Callable[[Any], None]]]) – Dict where keys are feature names and values are callable to initialize embedding tables
combiner (Optional[str], optional) – Feature aggregation option, by default “mean”
tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter columns, by default None
item_id (Optional[str], optional) – Name of the item id column (feature), by default None
automatic_build (bool, optional) – Automatically infers input size from features, by default True
max_sequence_length (Optional[int], optional) – Maximum sequence length for list features,, by default None

Returns

Returns the EmbeddingFeatures for the dataset schema

Return type

Optional[EmbeddingFeatures]

item_ids(inputs) → torch.Tensor [source]

forward(inputs, **kwargs)[source]

forward_output_size(input_sizes)[source]

class transformers4rec.torch.features.embedding.EmbeddingBagWrapper(num_embeddings: int, embedding_dim: int, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, mode: str = 'mean', sparse: bool = False, _weight: Optional[torch.Tensor] = None, include_last_offset: bool = False, padding_idx: Optional[int] = None, device=None, dtype=None)[source]

Bases: torch.nn.modules.sparse.EmbeddingBag

Wrapper class for the PyTorch EmbeddingBag module.

This class extends the torch.nn.EmbeddingBag class and overrides the forward method to handle 1D tensor inputs by reshaping them to 2D as required by the EmbeddingBag.

forward(input, **kwargs)[source]

num_embeddings: int

embedding_dim: int

max_norm: Optional[float]

norm_type: float

scale_grad_by_freq: bool

weight: torch.Tensor

mode: str

sparse: bool

include_last_offset: bool

padding_idx: Optional[int]

class transformers4rec.torch.features.embedding.SoftEmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], layer_norm: bool = True, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, **kwarg)[source]

Bases: transformers4rec.torch.features.embedding.EmbeddingFeatures

Encapsulate continuous features encoded using the Soft-one hot encoding embedding technique (SoftEmbedding), from https://arxiv.org/pdf/1708.00065.pdf In a nutshell, it keeps an embedding table for each continuous feature, which is represented as a weighted average of embeddings.

Parameters

feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
layer_norm (boolean) – When layer_norm is true, TabularLayerNorm will be used in post.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).
aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, soft_embedding_cardinalities: Optional[Dict[str, int]] = None, soft_embedding_cardinality_default: int = 10, soft_embedding_dims: Optional[Dict[str, int]] = None, soft_embedding_dim_default: int = 8, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, layer_norm: bool = True, combiner: str = 'mean', tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]]]] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, **kwargs) → Optional[transformers4rec.torch.features.embedding.SoftEmbeddingFeatures][source]

Instantitates SoftEmbeddingFeatures from a DatasetSchema.

Parameters

schema (DatasetSchema) – Dataset schema
soft_embedding_cardinalities (Optional[Dict[str, int]], optional) – The cardinality of the embedding table for each feature (key), by default None
soft_embedding_cardinality_default (Optional[int], optional) – Default cardinality of the embedding table, when the feature is not found in soft_embedding_cardinalities, by default 10
soft_embedding_dims (Optional[Dict[str, int]], optional) – The dimension of the embedding table for each feature (key), by default None
soft_embedding_dim_default (Optional[int], optional) – Default dimension of the embedding table, when the feature is not found in soft_embedding_dim_default, by default 8
embeddings_initializers (Optional[Dict[str, Callable[[Any], None]]]) – Dict where keys are feature names and values are callable to initialize embedding tables
combiner (Optional[str], optional) – Feature aggregation option, by default “mean”
tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter columns, by default None
automatic_build (bool, optional) – Automatically infers input size from features, by default True
max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None

Returns

Returns a SoftEmbeddingFeatures instance from the dataset schema

Return type

Optional[SoftEmbeddingFeatures]

table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → transformers4rec.torch.features.embedding.SoftEmbedding [source]

class transformers4rec.torch.features.embedding.TableConfig(vocabulary_size: int, dim: int, initializer: Optional[Callable[[torch.Tensor], None]] = None, combiner: str = 'mean', name: Optional[str] = None)[source]

Bases: object

Class to configure the embeddings lookup table for a categorical feature.

vocabulary_size

The size of the vocabulary, i.e., the cardinality of the categorical feature.

Type: int

dim

The dimensionality of the embedding vectors.

Type: int

initializer

The initializer function for the embedding weights. If None, the weights are initialized using a normal distribution with mean 0.0 and standard deviation 0.05.

Type: Optional[Callable[[torch.Tensor], None]]

combiner

The combiner operation used to aggregate bag of embeddings. Possible options are “mean”, “sum”, and “sqrtn”. By default “mean”.

Type: Optional[str]

name

The name of the lookup table. By default None.

Type: Optional[str]

class transformers4rec.torch.features.embedding.FeatureConfig(table: transformers4rec.torch.features.embedding.TableConfig, max_sequence_length: int = 0, name: Optional[str] = None)[source]

Bases: object

Class to set the embeddings table of a categorical feature with a maximum sequence length.

table

Configuration for the lookup table, which is used for embedding lookup and aggregation.

Type: TableConfig

max_sequence_length

Maximum sequence length for sequence features. By default 0.

Type: int, optional

name

The feature name. By default None

Type: str, optional

class transformers4rec.torch.features.embedding.SoftEmbedding(num_embeddings, embeddings_dim, emb_initializer=None)[source]

Bases: torch.nn.modules.module.Module

Soft-one hot encoding embedding technique, from https://arxiv.org/pdf/1708.00065.pdf In a nutshell, it represents a continuous feature as a weighted average of embeddings

forward(input_numeric)[source]

training: bool

class transformers4rec.torch.features.embedding.PretrainedEmbeddingsInitializer(weight_matrix: Union[torch.Tensor, List[List[float]]], trainable: bool = False, **kwargs)[source]

Bases: torch.nn.modules.module.Module

Initializer of embedding tables with pre-trained weights

Parameters

weight_matrix (Union[torch.Tensor, List[List[float]]]) – A 2D torch or numpy tensor or lists of lists with the pre-trained weights for embeddings. The expect dims are (embedding_cardinality, embedding_dim). The embedding_cardinality can be inferred from the column schema, for example, schema.select_by_name(“item_id”).feature[0].int_domain.max + 1. The first position of the embedding table is reserved for padded items (id=0).
trainable (bool) – Whether the embedding table should be trainable or not

forward(x)[source]

training: bool

class transformers4rec.torch.features.embedding.PretrainedEmbeddingFeatures(features: List[str], pretrained_output_dims: Optional[Union[int, Dict[str, int]]] = None, sequence_combiner: Optional[Union[str, torch.nn.modules.module.Module]] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, normalizer: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]

Bases: transformers4rec.torch.features.base.InputBlock

Input block for pre-trained embeddings features.

For 3-D features, if sequence_combiner is set, the features are aggregated using the second dimension (sequence length)

Parameters

features (List[str]) – A list of the pre-trained embeddings feature names. You typically will pass schema.select_by_tag(Tags.EMBEDDING).column_names, as that is the tag added to pre-trained embedding features when using the merlin.dataloader.ops.embeddings.EmbeddingOperator
pretrained_output_dims (Optional[Union[int, Dict[str, int]]]) – If provided, it projects features to specified dim(s). If an int, all features are projected to that dim. If a dict, only features provided in the dict will be mapped to the specified dim,
sequence_combiner (Optional[Union[str, torch.nn.Module]], optional) – A string (“mean”, “sum”, “max”, “min”) or torch.nn.Module specifying how to combine the second dimension of the pre-trained embeddings if it is 3D. Default is None (no sequence combiner used)
normalizer (Optional[Union[str, TabularTransformationType]]) – A tabular layer (e.g.tr.TabularLayerNorm()) or string (“layer-norm”) to be applied to pre-trained embeddings after projected and sequence combined Default is None (no normalization)
(Optional[Schema]) (schema) –
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).
aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

build(input_size, **kwargs)[source]

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]]]] = None, pretrained_output_dims=None, sequence_combiner=None, normalizer: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, **kwargs)[source]

forward(inputs)[source]

forward_output_size(input_sizes)[source]

parse_combiner(combiner)[source]

transformers4rec.torch.features.sequence module

class transformers4rec.torch.features.sequence.SequenceEmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], item_id: Optional[str] = None, padding_idx: int = 0, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]

Bases: transformers4rec.torch.features.embedding.EmbeddingFeatures

Input block for embedding-lookups for categorical features. This module produces 3-D tensors, this is useful for sequential models like transformers.

Parameters

feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.
item_id (str, optional) – The name of the feature that’s used for the item_id.
padding_idx (int) – The symbol to use for padding.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).
aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → torch.nn.modules.sparse.Embedding [source]

forward_output_size(input_sizes)[source]

class transformers4rec.torch.features.sequence.TabularSequenceFeatures(continuous_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, categorical_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, pretrained_embedding_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, projection_module: Optional[Union[transformers4rec.torch.block.base.BlockBase, transformers4rec.torch.block.base.BuildableBlock, torch.nn.modules.module.Module]] = None, masking: Optional[transformers4rec.torch.masking.MaskSequence] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]

Bases: transformers4rec.torch.features.tabular.TabularFeatures

Input module that combines different types of features to a sequence: continuous, categorical & text.

Parameters

continuous_module (TabularModule, optional) – Module used to process continuous features.
categorical_module (TabularModule, optional) – Module used to process categorical features.
text_embedding_module (TabularModule, optional) – Module used to process text features.
projection_module (BlockOrModule, optional) – Module that’s used to project the output of this module, typically done by an MLPBlock.
masking (MaskSequence, optional) – Masking to apply to the inputs.
pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).
post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).
aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

EMBEDDING_MODULE_CLASS: alias of transformers4rec.torch.features.sequence.SequenceEmbeddingFeatures

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CATEGORICAL: 'categorical'>,), pretrained_embeddings_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.EMBEDDING: 'embedding'>,), aggregation: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, continuous_projection: Optional[Union[int, List[int]]] = None, continuous_soft_embeddings: bool = False, projection: Optional[Union[torch.nn.modules.module.Module, transformers4rec.torch.block.base.BuildableBlock]] = None, d_output: Optional[int] = None, masking: Optional[Union[str, transformers4rec.torch.masking.MaskSequence]] = None, **kwargs) → transformers4rec.torch.features.sequence.TabularSequenceFeatures [source]

Instantiates TabularFeatures from a DatasetSchema

Parameters

schema (DatasetSchema) – Dataset schema
continuous_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the continuous features, by default Tags.CONTINUOUS
categorical_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the categorical features, by default Tags.CATEGORICAL
aggregation (Optional[str], optional) – Feature aggregation option, by default None
automatic_build (bool, optional) – Automatically infers input size from features, by default True
max_sequence_length (Optional[int], optional) – Maximum sequence length for list features by default None
continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them by a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None
continuous_soft_embeddings (bool) – Indicates if the soft one-hot encoding technique must be used to represent continuous features, by default False
projection (Optional[Union[torch.nn.Module, BuildableBlock]], optional) – If set, project the aggregated embeddings vectors into hidden dimension vector space, by default None
d_output (Optional[int], optional) – If set, init a MLPBlock as projection module to project embeddings vectors, by default None
masking (Optional[Union[str, MaskSequence]], optional) – If set, Apply masking to the input embeddings and compute masked labels, It requires a categorical_module including an item_id column, by default None

Returns

Returns TabularFeatures from a dataset schema

Return type

TabularFeatures

property masking

set_masking(value)[source]

property item_id

property item_embedding_table

forward(inputs, training=False, testing=False, **kwargs)[source]

project_continuous_features(dimensions)[source]

forward_output_size(input_size)[source]

transformers4rec.torch.features.tabular module

class transformers4rec.torch.features.tabular.TabularFeatures(continuous_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, categorical_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, pretrained_embedding_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]

Bases: transformers4rec.torch.tabular.base.MergeTabular

Input module that combines different types of features: continuous, categorical & text.

Parameters

continuous_module (TabularModule, optional) – Module used to process continuous features.
categorical_module (TabularModule, optional) – Module used to process categorical features.
text_embedding_module (TabularModule, optional) – Module used to process text features.

pre: Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional: Transformations to apply on the inputs when the module is called (so before forward).
post: Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional: Transformations to apply on the inputs after the module is called (so after forward).
aggregation: Union[str, TabularAggregation], optional: Aggregation to apply after processing the forward-method to output a single Tensor.

CONTINUOUS_MODULE_CLASS: alias of transformers4rec.torch.features.continuous.ContinuousFeatures

EMBEDDING_MODULE_CLASS: alias of transformers4rec.torch.features.embedding.EmbeddingFeatures

SOFT_EMBEDDING_MODULE_CLASS: alias of transformers4rec.torch.features.embedding.SoftEmbeddingFeatures

PRETRAINED_EMBEDDING_MODULE_CLASS: alias of transformers4rec.torch.features.embedding.PretrainedEmbeddingFeatures

project_continuous_features(mlp_layers_dims: Union[List[int], int]) → transformers4rec.torch.features.tabular.TabularFeatures [source]

Combine all concatenated continuous features with stacked MLP layers

Parameters: mlp_layers_dims (Union[List[int], int]) – The MLP layer dimensions
Returns: Returns the same TabularFeatures object with the continuous features projected
Return type: TabularFeatures

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CATEGORICAL: 'categorical'>,), pretrained_embeddings_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.EMBEDDING: 'embedding'>,), aggregation: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, continuous_projection: Optional[Union[int, List[int]]] = None, continuous_soft_embeddings: bool = False, **kwargs) → transformers4rec.torch.features.tabular.TabularFeatures [source]

Instantiates TabularFeatures from a DatasetSchema

Parameters

schema (DatasetSchema) – Dataset schema
continuous_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the continuous features, by default Tags.CONTINUOUS
categorical_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the categorical features, by default Tags.CATEGORICAL
aggregation (Optional[str], optional) – Feature aggregation option, by default None
automatic_build (bool, optional) – Automatically infers input size from features, by default True
max_sequence_length (Optional[int], optional) – Maximum sequence length for list features by default None
continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them by a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None
continuous_soft_embeddings (bool) – Indicates if the soft one-hot encoding technique must be used to represent continuous features, by default False

Returns

Returns TabularFeatures from a dataset schema

Return type

TabularFeatures

forward_output_size(input_size)[source]

property continuous_module

property categorical_module

property pretrained_module

transformers4rec.torch.features package

Submodules

transformers4rec.torch.features.base module

transformers4rec.torch.features.continuous module

transformers4rec.torch.features.embedding module

transformers4rec.torch.features.sequence module

transformers4rec.torch.features.tabular module

transformers4rec.torch.features.text module

Module contents