transformers4rec.torch.features package

Submodules

transformers4rec.torch.features.base module

class transformers4rec.torch.features.base.InputBlock(pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]

Bases: transformers4rec.torch.tabular.base.TabularBlock, abc.ABC

transformers4rec.torch.features.continuous module

class transformers4rec.torch.features.continuous.ContinuousFeatures(features: List[str], pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]

Bases: transformers4rec.torch.features.base.InputBlock

Input block for continuous features.

Parameters
classmethod from_features(features, **kwargs)[source]
forward(inputs, **kwargs)[source]
forward_output_size(input_sizes)[source]

transformers4rec.torch.features.embedding module

class transformers4rec.torch.features.embedding.EmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], item_id: Optional[str] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]

Bases: transformers4rec.torch.features.base.InputBlock

Input block for embedding-lookups for categorical features.

For multi-hot features, the embeddings will be aggregated into a single tensor using the mean.

Parameters
  • feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.

  • item_id (str, optional) – The name of the feature that’s used for the item_id.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).

  • aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

property item_embedding_table
table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → torch.nn.modules.module.Module[source]
classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, embedding_dims: Optional[Dict[str, int]] = None, embedding_dim_default: int = 64, infer_embedding_sizes: bool = False, infer_embedding_sizes_multiplier: float = 2.0, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, combiner: str = 'mean', tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]]]] = None, item_id: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, aggregation=None, pre=None, post=None, **kwargs) → Optional[transformers4rec.torch.features.embedding.EmbeddingFeatures][source]

Instantiates EmbeddingFeatures from a DatasetSchema.

Parameters
  • schema (DatasetSchema) – Dataset schema

  • embedding_dims (Optional[Dict[str, int]], optional) – The dimension of the embedding table for each feature (key), by default None

  • embedding_dim_default (int, optional) – Default dimension of the embedding table, used when the feature is not found in embedding_dims, by default 64

  • infer_embedding_sizes (bool, optional) – Automatically defines the embedding dimension from the feature cardinality in the schema, by default False

  • infer_embedding_sizes_multiplier (float, optional) – Multiplier used by the heuristic to infer the embedding dimension from the feature cardinality. Generally, reasonable values range between 2.0 and 10.0. By default 2.0

  • embeddings_initializers (Optional[Dict[str, Callable[[Any], None]]]) – Dict where keys are feature names and values are callable to initialize embedding tables

  • combiner (Optional[str], optional) – Feature aggregation option, by default “mean”

  • tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter columns, by default None

  • item_id (Optional[str], optional) – Name of the item id column (feature), by default None

  • automatic_build (bool, optional) – Automatically infers input size from features, by default True

  • max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None

Returns

Returns the EmbeddingFeatures for the dataset schema

Return type

Optional[EmbeddingFeatures]
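
The interaction between embedding_dims, embedding_dim_default and the infer_embedding_sizes options can be sketched in plain Python. This is only an illustration under stated assumptions: the helper resolve_embedding_dim is hypothetical, the precedence shown (explicit entry, then inferred size, then default) is assumed, and the heuristic used here (multiplier times the fourth root of the cardinality, rounded up) is a common rule of thumb that may differ from what the library actually computes.

```python
import math
from typing import Dict, Optional


def resolve_embedding_dim(
    feature: str,
    cardinality: int,
    embedding_dims: Optional[Dict[str, int]] = None,
    embedding_dim_default: int = 64,
    infer_embedding_sizes: bool = False,
    infer_embedding_sizes_multiplier: float = 2.0,
) -> int:
    """Pick an embedding dimension for one categorical feature (illustrative)."""
    # Assumption: an explicit per-feature entry takes precedence.
    if embedding_dims and feature in embedding_dims:
        return embedding_dims[feature]
    # Assumption: heuristic is multiplier * cardinality ** 0.25, rounded up.
    if infer_embedding_sizes:
        return int(math.ceil(infer_embedding_sizes_multiplier * cardinality ** 0.25))
    # Otherwise fall back to the default dimension.
    return embedding_dim_default


dims = {
    "explicit": resolve_embedding_dim("item_id", 5000, embedding_dims={"item_id": 128}),
    "inferred": resolve_embedding_dim("item_id", 5000, infer_embedding_sizes=True),
    "default": resolve_embedding_dim("item_id", 5000),
}
print(dims)
```

Under this assumed heuristic, a larger multiplier trades memory for capacity: the same 5000-value feature gets a wider table as the multiplier grows from 2.0 toward 10.0.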

item_ids(inputs) → torch.Tensor[source]
forward(inputs, **kwargs)[source]
forward_output_size(input_sizes)[source]
class transformers4rec.torch.features.embedding.EmbeddingBagWrapper(num_embeddings: int, embedding_dim: int, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, mode: str = 'mean', sparse: bool = False, _weight: Optional[torch.Tensor] = None, include_last_offset: bool = False, padding_idx: Optional[int] = None, device=None, dtype=None)[source]

Bases: torch.nn.modules.sparse.EmbeddingBag

Wrapper class for the PyTorch EmbeddingBag module.

This class extends the torch.nn.EmbeddingBag class and overrides the forward method to handle 1D tensor inputs by reshaping them to 2D as required by the EmbeddingBag.

forward(input, **kwargs)[source]
num_embeddings: int
embedding_dim: int
max_norm: Optional[float]
norm_type: float
scale_grad_by_freq: bool
weight: torch.Tensor
mode: str
sparse: bool
include_last_offset: bool
padding_idx: Optional[int]
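
The reshaping described above can be pictured without PyTorch. The sketch below is a plain-Python stand-in, not the wrapper itself: as_bags and embedding_bag_mean are hypothetical helpers, and tensors are modelled as nested lists. A 1-D input of ids is treated as a batch of single-item bags, which is the 2-D layout a bag-style lookup expects.

```python
from typing import List


def as_bags(ids) -> List[List[int]]:
    """Turn a 1-D list of ids into a 2-D list with one id per bag."""
    if ids and isinstance(ids[0], int):
        return [[i] for i in ids]
    return [list(bag) for bag in ids]


def embedding_bag_mean(table: List[List[float]], ids) -> List[List[float]]:
    """Look up each bag of ids and average its embedding rows (mode='mean')."""
    out = []
    for bag in as_bags(ids):
        rows = [table[i] for i in bag]
        # zip(*rows) iterates column-wise over the selected rows.
        out.append([sum(col) / len(rows) for col in zip(*rows)])
    return out


table = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
print(embedding_bag_mean(table, [1, 2]))    # 1-D input: two single-item bags
print(embedding_bag_mean(table, [[1, 2]]))  # 2-D input: one bag with two items
```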
class transformers4rec.torch.features.embedding.SoftEmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], layer_norm: bool = True, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, **kwarg)[source]

Bases: transformers4rec.torch.features.embedding.EmbeddingFeatures

Encapsulates continuous features encoded using the soft one-hot encoding embedding technique (SoftEmbedding), from https://arxiv.org/pdf/1708.00065.pdf. In a nutshell, it keeps an embedding table for each continuous feature, which is represented as a weighted average of embeddings.

Parameters
  • feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.

  • layer_norm (boolean) – When layer_norm is true, TabularLayerNorm will be used in post.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).

  • aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, soft_embedding_cardinalities: Optional[Dict[str, int]] = None, soft_embedding_cardinality_default: int = 10, soft_embedding_dims: Optional[Dict[str, int]] = None, soft_embedding_dim_default: int = 8, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, layer_norm: bool = True, combiner: str = 'mean', tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]]]] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, **kwargs) → Optional[transformers4rec.torch.features.embedding.SoftEmbeddingFeatures][source]

Instantiates SoftEmbeddingFeatures from a DatasetSchema.

Parameters
  • schema (DatasetSchema) – Dataset schema

  • soft_embedding_cardinalities (Optional[Dict[str, int]], optional) – The cardinality of the embedding table for each feature (key), by default None

  • soft_embedding_cardinality_default (Optional[int], optional) – Default cardinality of the embedding table, when the feature is not found in soft_embedding_cardinalities, by default 10

  • soft_embedding_dims (Optional[Dict[str, int]], optional) – The dimension of the embedding table for each feature (key), by default None

  • soft_embedding_dim_default (Optional[int], optional) – Default dimension of the embedding table, used when the feature is not found in soft_embedding_dims, by default 8

  • embeddings_initializers (Optional[Dict[str, Callable[[Any], None]]]) – Dict where keys are feature names and values are callable to initialize embedding tables

  • combiner (Optional[str], optional) – Feature aggregation option, by default “mean”

  • tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter columns, by default None

  • automatic_build (bool, optional) – Automatically infers input size from features, by default True

  • max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None

Returns

Returns a SoftEmbeddingFeatures instance from the dataset schema

Return type

Optional[SoftEmbeddingFeatures]

table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → transformers4rec.torch.features.embedding.SoftEmbedding[source]
class transformers4rec.torch.features.embedding.TableConfig(vocabulary_size: int, dim: int, initializer: Optional[Callable[[torch.Tensor], None]] = None, combiner: str = 'mean', name: Optional[str] = None)[source]

Bases: object

Class to configure the embeddings lookup table for a categorical feature.

vocabulary_size

The size of the vocabulary, i.e., the cardinality of the categorical feature.

Type

int

dim

The dimensionality of the embedding vectors.

Type

int

initializer

The initializer function for the embedding weights. If None, the weights are initialized using a normal distribution with mean 0.0 and standard deviation 0.05.

Type

Optional[Callable[[torch.Tensor], None]]

combiner

The combiner operation used to aggregate bag of embeddings. Possible options are “mean”, “sum”, and “sqrtn”. By default “mean”.

Type

Optional[str]

name

The name of the lookup table. By default None.

Type

Optional[str]
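
The three combiner options can be compared on a small bag of embedding rows. The sketch below is plain Python and assumes the usual definitions: "sum" adds the rows, "mean" divides the sum by the number of rows, and "sqrtn" divides the sum by the square root of the number of rows. The COMBINERS mapping and combine helper are hypothetical names, and the library may treat weights or padding differently.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def _column_sums(rows: List[Vector]) -> Vector:
    """Sum a bag of embedding rows element-wise."""
    return [sum(col) for col in zip(*rows)]


# Assumed definitions of the combiner names listed above.
COMBINERS: Dict[str, Callable[[List[Vector]], Vector]] = {
    "sum": lambda rows: _column_sums(rows),
    "mean": lambda rows: [v / len(rows) for v in _column_sums(rows)],
    "sqrtn": lambda rows: [v / math.sqrt(len(rows)) for v in _column_sums(rows)],
}


def combine(rows: List[Vector], combiner: str = "mean") -> Vector:
    """Aggregate a bag of embeddings into a single vector."""
    return COMBINERS[combiner](rows)


bag = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
for name in ("sum", "mean", "sqrtn"):
    print(name, combine(bag, name))
```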

class transformers4rec.torch.features.embedding.FeatureConfig(table: transformers4rec.torch.features.embedding.TableConfig, max_sequence_length: int = 0, name: Optional[str] = None)[source]

Bases: object

Class to set the embeddings table of a categorical feature with a maximum sequence length.

table

Configuration for the lookup table, which is used for embedding lookup and aggregation.

Type

TableConfig

max_sequence_length

Maximum sequence length for sequence features. By default 0.

Type

int, optional

name

The feature name. By default None.

Type

str, optional
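
How one table configuration can back several features (shared embeddings) can be sketched with stand-in dataclasses. SketchTableConfig and SketchFeatureConfig below are hypothetical: they only mirror the constructor signatures listed above so the example runs without the library, and the feature names and sizes are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SketchTableConfig:
    """Stand-in mirroring the TableConfig signature shown above."""
    vocabulary_size: int
    dim: int
    initializer: Optional[Callable] = None
    combiner: str = "mean"
    name: Optional[str] = None


@dataclass
class SketchFeatureConfig:
    """Stand-in mirroring the FeatureConfig signature shown above."""
    table: SketchTableConfig
    max_sequence_length: int = 0
    name: Optional[str] = None


item_table = SketchTableConfig(vocabulary_size=5000, dim=64, name="item_id")
category_table = SketchTableConfig(vocabulary_size=300, dim=16, name="category")

# Two features point at the same table object, so they share one embedding table.
feature_config: Dict[str, SketchFeatureConfig] = {
    "item_id": SketchFeatureConfig(item_table, max_sequence_length=20, name="item_id"),
    "last_clicked_item": SketchFeatureConfig(item_table, name="last_clicked_item"),
    "category": SketchFeatureConfig(category_table, max_sequence_length=20, name="category"),
}

distinct_tables = {id(fc.table) for fc in feature_config.values()}
print(len(feature_config), "features use", len(distinct_tables), "tables")
```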

class transformers4rec.torch.features.embedding.SoftEmbedding(num_embeddings, embeddings_dim, emb_initializer=None)[source]

Bases: torch.nn.modules.module.Module

Soft one-hot encoding embedding technique, from https://arxiv.org/pdf/1708.00065.pdf. In a nutshell, it represents a continuous feature as a weighted average of embeddings.

forward(input_numeric)[source]
training: bool
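
The "weighted average of embeddings" idea can be made concrete with a small plain-Python sketch. It assumes the weights come from a softmax over a linear projection of the scalar input, one logit per embedding row; the function names and the fixed projection values are hypothetical (in a trained model the projection and the table would be learned), so treat this as an illustration rather than the module's implementation.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_embedding(value: float, proj_weight: List[float], proj_bias: List[float],
                   table: List[List[float]]) -> List[float]:
    """Represent a scalar as a softmax-weighted average of embedding rows."""
    # One logit per embedding row: a linear map of the scalar input.
    logits = [w * value + b for w, b in zip(proj_weight, proj_bias)]
    weights = softmax(logits)
    dim = len(table[0])
    # Weighted average of the rows; the weights are positive and sum to 1.
    return [sum(wt * row[d] for wt, row in zip(weights, table)) for d in range(dim)]


table = [[0.0, 3.0], [3.0, 0.0], [6.0, 3.0]]            # 3 embeddings of dim 2
proj_weight, proj_bias = [-1.0, 0.0, 1.0], [0.0, 0.0, 0.0]

centered = soft_embedding(0.0, proj_weight, proj_bias, table)   # uniform weights
large = soft_embedding(10.0, proj_weight, proj_bias, table)     # leans to last row
print(centered, large)
```

Because the weights form a convex combination, the output always stays between the smallest and largest value of each embedding dimension.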
class transformers4rec.torch.features.embedding.PretrainedEmbeddingsInitializer(weight_matrix: Union[torch.Tensor, List[List[float]]], trainable: bool = False, **kwargs)[source]

Bases: torch.nn.modules.module.Module

Initializer of embedding tables with pre-trained weights

Parameters
  • weight_matrix (Union[torch.Tensor, List[List[float]]]) – A 2D torch or numpy tensor, or a list of lists, with the pre-trained weights for the embeddings. The expected dimensions are (embedding_cardinality, embedding_dim). The embedding_cardinality can be inferred from the column schema, for example, schema.select_by_name("item_id").feature[0].int_domain.max + 1. The first position of the embedding table is reserved for padded items (id=0).

  • trainable (bool) – Whether the embedding table should be trainable or not

forward(x)[source]
training: bool
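
The expected layout of weight_matrix can be sketched in plain Python. The helper build_weight_matrix is hypothetical: it takes pre-trained vectors keyed by item id and the largest id from the schema, and returns a list of lists with max_id + 1 rows, leaving row 0 for padding. Filling the padding row and any missing ids with zeros is an assumption made for this example.

```python
from typing import Dict, List


def build_weight_matrix(pretrained: Dict[int, List[float]], max_id: int,
                        dim: int) -> List[List[float]]:
    """Arrange pre-trained vectors as a (max_id + 1, dim) matrix, row 0 = padding."""
    cardinality = max_id + 1  # mirrors int_domain.max + 1 from the schema
    matrix = [[0.0] * dim for _ in range(cardinality)]
    for item_id, vector in pretrained.items():
        if not 1 <= item_id <= max_id:
            raise ValueError(f"item id {item_id} is outside 1..{max_id}")
        if len(vector) != dim:
            raise ValueError(f"vector for item {item_id} has length {len(vector)}, expected {dim}")
        matrix[item_id] = list(vector)
    return matrix


weights = build_weight_matrix({1: [0.1, 0.2], 3: [0.5, 0.6]}, max_id=3, dim=2)
print(len(weights), len(weights[0]))  # rows, columns
```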
class transformers4rec.torch.features.embedding.PretrainedEmbeddingFeatures(features: List[str], pretrained_output_dims: Optional[Union[int, Dict[str, int]]] = None, sequence_combiner: Optional[Union[str, torch.nn.modules.module.Module]] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, normalizer: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]

Bases: transformers4rec.torch.features.base.InputBlock

Input block for pre-trained embeddings features.

For 3-D features, if sequence_combiner is set, the features are aggregated along the second dimension (sequence length).

Parameters
  • features (List[str]) – A list of the pre-trained embeddings feature names. You typically will pass schema.select_by_tag(Tags.EMBEDDING).column_names, as that is the tag added to pre-trained embedding features when using the merlin.dataloader.ops.embeddings.EmbeddingOperator

  • pretrained_output_dims (Optional[Union[int, Dict[str, int]]]) – If provided, it projects features to specified dim(s). If an int, all features are projected to that dim. If a dict, only features provided in the dict will be mapped to the specified dim.

  • sequence_combiner (Optional[Union[str, torch.nn.Module]], optional) – A string (“mean”, “sum”, “max”, “min”) or torch.nn.Module specifying how to combine the second dimension of the pre-trained embeddings if it is 3D. Default is None (no sequence combiner used)

  • normalizer (Optional[Union[str, TabularTransformationType]]) – A tabular layer (e.g. tr.TabularLayerNorm()) or string (“layer-norm”) to be applied to the pre-trained embeddings after they are projected and sequence-combined. Default is None (no normalization)

  • schema (Optional[Schema]) – The schema of the input features, by default None

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).

  • aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

build(input_size, **kwargs)[source]
classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]]]] = None, pretrained_output_dims=None, sequence_combiner=None, normalizer: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, **kwargs)[source]
forward(inputs)[source]
forward_output_size(input_sizes)[source]
parse_combiner(combiner)[source]
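
What a sequence_combiner does to a 3-D input can be sketched with nested lists standing in for tensors. The sketch assumes the layout (batch, sequence length, embedding dim) and reduces over the second dimension with one of the string options named above; REDUCERS and combine_sequence are hypothetical names, not library functions.

```python
from typing import Callable, Dict, List, Sequence

REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": lambda xs: sum(xs) / len(xs),
    "sum": sum,
    "max": max,
    "min": min,
}


def combine_sequence(batch: List[List[List[float]]], combiner: str) -> List[List[float]]:
    """Reduce (batch, seq_len, dim) to (batch, dim) over the sequence dimension."""
    reduce = REDUCERS[combiner]
    # zip(*sequence) groups the values of each embedding dimension across steps.
    return [[reduce(col) for col in zip(*sequence)] for sequence in batch]


batch = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 0.0], [2.0, 2.0], [4.0, 10.0]],
]
for name in ("mean", "sum", "max", "min"):
    print(name, combine_sequence(batch, name))
```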

transformers4rec.torch.features.sequence module

class transformers4rec.torch.features.sequence.SequenceEmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], item_id: Optional[str] = None, padding_idx: int = 0, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]

Bases: transformers4rec.torch.features.embedding.EmbeddingFeatures

Input block for embedding-lookups for categorical features. This module produces 3-D tensors, which is useful for sequential models like transformers.

Parameters
  • feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features.

  • item_id (str, optional) – The name of the feature that’s used for the item_id.

  • padding_idx (int) – The symbol to use for padding.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).

  • aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → torch.nn.modules.sparse.Embedding[source]
forward_output_size(input_sizes)[source]
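
The 3-D output shape and the role of padding_idx can be sketched without PyTorch. In the plain-Python stand-in below, sequence_lookup is a hypothetical helper, tensors are nested lists, and positions holding padding_idx map to an all-zeros vector (torch.nn.Embedding initializes the padding row to zeros by default; treating it as fixed zeros here is a simplification).

```python
from typing import List


def sequence_lookup(table: List[List[float]], id_sequences: List[List[int]],
                    padding_idx: int = 0) -> List[List[List[float]]]:
    """Map (batch, seq_len) ids to (batch, seq_len, dim) embeddings."""
    dim = len(table[0])
    return [
        [[0.0] * dim if i == padding_idx else list(table[i]) for i in sequence]
        for sequence in id_sequences
    ]


table = [[9.0, 9.0], [1.0, 2.0], [3.0, 4.0]]   # row 0 is never returned as-is
sessions = [[1, 2, 0], [2, 0, 0]]              # right-padded with padding_idx=0
embedded = sequence_lookup(table, sessions)
print(len(embedded), len(embedded[0]), len(embedded[0][0]))  # batch, seq_len, dim
```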
class transformers4rec.torch.features.sequence.TabularSequenceFeatures(continuous_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, categorical_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, pretrained_embedding_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, projection_module: Optional[Union[transformers4rec.torch.block.base.BlockBase, transformers4rec.torch.block.base.BuildableBlock, torch.nn.modules.module.Module]] = None, masking: Optional[transformers4rec.torch.masking.MaskSequence] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]

Bases: transformers4rec.torch.features.tabular.TabularFeatures

Input module that combines different types of features into a sequence: continuous, categorical, and pre-trained embeddings.

Parameters
  • continuous_module (TabularModule, optional) – Module used to process continuous features.

  • categorical_module (TabularModule, optional) – Module used to process categorical features.

  • pretrained_embedding_module (TabularModule, optional) – Module used to process pre-trained embedding features.

  • projection_module (BlockOrModule, optional) – Module that’s used to project the output of this module, typically done by an MLPBlock.

  • masking (MaskSequence, optional) – Masking to apply to the inputs.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).

  • aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

EMBEDDING_MODULE_CLASS

alias of transformers4rec.torch.features.sequence.SequenceEmbeddingFeatures

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CATEGORICAL: 'categorical'>,), pretrained_embeddings_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.EMBEDDING: 'embedding'>,), aggregation: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, continuous_projection: Optional[Union[int, List[int]]] = None, continuous_soft_embeddings: bool = False, projection: Optional[Union[torch.nn.modules.module.Module, transformers4rec.torch.block.base.BuildableBlock]] = None, d_output: Optional[int] = None, masking: Optional[Union[str, transformers4rec.torch.masking.MaskSequence]] = None, **kwargs) → transformers4rec.torch.features.sequence.TabularSequenceFeatures[source]

Instantiates TabularSequenceFeatures from a DatasetSchema.

Parameters
  • schema (DatasetSchema) – Dataset schema

  • continuous_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the continuous features, by default Tags.CONTINUOUS

  • categorical_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the categorical features, by default Tags.CATEGORICAL

  • aggregation (Optional[str], optional) – Feature aggregation option, by default None

  • automatic_build (bool, optional) – Automatically infers input size from features, by default True

  • max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None

  • continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them by a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None

  • continuous_soft_embeddings (bool) – Indicates if the soft one-hot encoding technique must be used to represent continuous features, by default False

  • projection (Optional[Union[torch.nn.Module, BuildableBlock]], optional) – If set, project the aggregated embeddings vectors into hidden dimension vector space, by default None

  • d_output (Optional[int], optional) – If set, init a MLPBlock as projection module to project embeddings vectors, by default None

  • masking (Optional[Union[str, MaskSequence]], optional) – If set, apply masking to the input embeddings and compute masked labels. It requires a categorical_module including an item_id column. By default None

Returns

Returns TabularSequenceFeatures from a dataset schema

Return type

TabularSequenceFeatures

property masking
set_masking(value)[source]
property item_id
property item_embedding_table
forward(inputs, training=False, testing=False, **kwargs)[source]
project_continuous_features(dimensions)[source]
forward_output_size(input_size)[source]

transformers4rec.torch.features.tabular module

class transformers4rec.torch.features.tabular.TabularFeatures(continuous_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, categorical_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, pretrained_embedding_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]

Bases: transformers4rec.torch.tabular.base.MergeTabular

Input module that combines different types of features: continuous, categorical, and pre-trained embeddings.

Parameters
  • continuous_module (TabularModule, optional) – Module used to process continuous features.

  • categorical_module (TabularModule, optional) – Module used to process categorical features.

  • pretrained_embedding_module (TabularModule, optional) – Module used to process pre-trained embedding features.

  • pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).

  • post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).

  • aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.

CONTINUOUS_MODULE_CLASS

alias of transformers4rec.torch.features.continuous.ContinuousFeatures

EMBEDDING_MODULE_CLASS

alias of transformers4rec.torch.features.embedding.EmbeddingFeatures

SOFT_EMBEDDING_MODULE_CLASS

alias of transformers4rec.torch.features.embedding.SoftEmbeddingFeatures

PRETRAINED_EMBEDDING_MODULE_CLASS

alias of transformers4rec.torch.features.embedding.PretrainedEmbeddingFeatures

project_continuous_features(mlp_layers_dims: Union[List[int], int]) → transformers4rec.torch.features.tabular.TabularFeatures[source]

Combine all concatenated continuous features with stacked MLP layers

Parameters

mlp_layers_dims (Union[List[int], int]) – The MLP layer dimensions

Returns

Returns the same TabularFeatures object with the continuous features projected

Return type

TabularFeatures
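
How mlp_layers_dims translates into stacked layers can be sketched by computing the (input, output) size of each layer. The helper mlp_layer_shapes and the feature names are hypothetical; the sketch assumes the continuous features are concatenated along the last dimension, so the first layer's input size is the sum of their widths, and that an int is treated as a single-layer list.

```python
from typing import Dict, List, Tuple, Union


def mlp_layer_shapes(feature_widths: Dict[str, int],
                     mlp_layers_dims: Union[List[int], int]) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) for each stacked projection layer."""
    dims = [mlp_layers_dims] if isinstance(mlp_layers_dims, int) else list(mlp_layers_dims)
    # Concatenating the continuous features gives the first layer's input size.
    in_dim = sum(feature_widths.values())
    shapes = []
    for out_dim in dims:
        shapes.append((in_dim, out_dim))
        in_dim = out_dim  # each layer feeds the next one
    return shapes


widths = {"price": 1, "age_days": 1, "hour_sin": 1}
print(mlp_layer_shapes(widths, [64, 32]))
print(mlp_layer_shapes(widths, 16))
```

The last entry's output size is the width of the projected continuous representation that downstream aggregation would see.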

classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.CATEGORICAL: 'categorical'>,), pretrained_embeddings_tags: Optional[Union[merlin.schema.tags.TagSet, List[str], List[merlin.schema.tags.Tags], List[Union[str, merlin.schema.tags.Tags]], Tuple[merlin.schema.tags.Tags]]] = (<Tags.EMBEDDING: 'embedding'>,), aggregation: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, continuous_projection: Optional[Union[int, List[int]]] = None, continuous_soft_embeddings: bool = False, **kwargs) → transformers4rec.torch.features.tabular.TabularFeatures[source]

Instantiates TabularFeatures from a DatasetSchema

Parameters
  • schema (DatasetSchema) – Dataset schema

  • continuous_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the continuous features, by default Tags.CONTINUOUS

  • categorical_tags (Optional[Union[TagsType, Tuple[Tags]]], optional) – Tags to filter the categorical features, by default Tags.CATEGORICAL

  • aggregation (Optional[str], optional) – Feature aggregation option, by default None

  • automatic_build (bool, optional) – Automatically infers input size from features, by default True

  • max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None

  • continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them by a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None

  • continuous_soft_embeddings (bool) – Indicates if the soft one-hot encoding technique must be used to represent continuous features, by default False

Returns

Returns TabularFeatures from a dataset schema

Return type

TabularFeatures
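
The tag-based routing performed by from_schema can be pictured with a toy schema. In the sketch below the schema is a plain dict from column name to a set of tag strings, which is a hypothetical stand-in for the real Schema object; the tag values 'continuous', 'categorical' and 'embedding' are taken from the defaults in the signature above, and selecting a column when it carries any of the requested tags is an assumption.

```python
from typing import Dict, Iterable, List, Optional, Set


def group_columns_by_tags(
    schema: Dict[str, Set[str]],
    continuous_tags: Optional[Iterable[str]] = ("continuous",),
    categorical_tags: Optional[Iterable[str]] = ("categorical",),
    pretrained_embeddings_tags: Optional[Iterable[str]] = ("embedding",),
) -> Dict[str, List[str]]:
    """Split column names into the groups each sub-module would receive."""

    def select(tags: Optional[Iterable[str]]) -> List[str]:
        wanted = set(tags or ())
        # Assumption: a column matches if it has at least one wanted tag.
        return [name for name, col_tags in schema.items() if wanted & set(col_tags)]

    return {
        "continuous": select(continuous_tags),
        "categorical": select(categorical_tags),
        "pretrained": select(pretrained_embeddings_tags),
    }


toy_schema = {
    "item_id": {"categorical", "item_id"},
    "category": {"categorical"},
    "price": {"continuous"},
    "item_text_vector": {"embedding"},
}
groups = group_columns_by_tags(toy_schema)
print(groups)
```

Passing None for one of the tag arguments leaves that group empty in this sketch, which models building the block without the corresponding sub-module.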

forward_output_size(input_size)[source]
property continuous_module
property categorical_module
property pretrained_module

transformers4rec.torch.features.text module

Module contents