transformers4rec.torch.features package
Submodules
transformers4rec.torch.features.base module
- 
class transformers4rec.torch.features.base.InputBlock(pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]
- Bases: transformers4rec.torch.tabular.base.TabularBlock, abc.ABC
transformers4rec.torch.features.continuous module
- 
class transformers4rec.torch.features.continuous.ContinuousFeatures(features: List[str], pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]
- Bases: transformers4rec.torch.features.base.InputBlock
- Input block for continuous features.
- Parameters
- features (List[str]) – List of continuous features to include in this module. 
- pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward). 
- post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward). 
- aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor. 
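The call order described by the parameters above (pre, then forward, then post, then aggregation) can be sketched without the library. This is a minimal, dependency-free illustration: `run_input_block`, `concat`, and `scale_by_ten` are hypothetical helpers, and plain Python lists stand in for tensors. It is not the library's implementation.

```python
from typing import Callable, Dict, List, Optional

Columns = Dict[str, List[float]]  # feature name -> one value per row


def run_input_block(
    inputs: Columns,
    features: List[str],
    pre: Optional[Callable[[Columns], Columns]] = None,
    post: Optional[Callable[[Columns], Columns]] = None,
    aggregation: Optional[Callable[[Columns], List[List[float]]]] = None,
):
    """Hypothetical stand-in showing the order in which the hooks run."""
    if pre is not None:
        inputs = pre(inputs)  # runs before "forward"
    # "forward": keep only the continuous features that were listed
    outputs = {name: inputs[name] for name in features}
    if post is not None:
        outputs = post(outputs)  # runs after "forward"
    if aggregation is not None:
        return aggregation(outputs)  # merge the dict into a single structure
    return outputs


def concat(columns: Columns) -> List[List[float]]:
    """Concatenate features column-wise, producing one row per example."""
    names = sorted(columns)
    n_rows = len(columns[names[0]])
    return [[columns[name][i] for name in names] for i in range(n_rows)]


def scale_by_ten(columns: Columns) -> Columns:
    """Toy transformation used as a `post` hook."""
    return {name: [v * 10.0 for v in values] for name, values in columns.items()}


batch = {"price": [1.0, 2.0], "age": [0.5, 0.25], "ignored": [9.0, 9.0]}
result = run_input_block(batch, ["price", "age"], post=scale_by_ten, aggregation=concat)
```

Without an aggregation the sketch returns a dictionary keyed by feature name; with one, it returns a single row-major structure, which mirrors the "output a single Tensor" wording above.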
 
 
transformers4rec.torch.features.embedding module
- 
class transformers4rec.torch.features.embedding.EmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], item_id: Optional[str] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]
- Bases: transformers4rec.torch.features.base.InputBlock
- Input block for embedding lookups for categorical features.
- For multi-hot features, the embeddings are aggregated into a single tensor using the mean.
- Parameters
- feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features. 
- item_id (str, optional) – The name of the feature that’s used for the item_id. 
 
- pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).
- post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).
- aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.
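The mean aggregation for multi-hot features mentioned in the class description can be illustrated with a small sketch. `mean_combine` is a hypothetical helper and the table is a plain list of rows; the library itself operates on tensors.

```python
from typing import List


def mean_combine(table: List[List[float]], ids: List[int]) -> List[float]:
    """Look up every id of a multi-hot feature and average the rows."""
    rows = [table[i] for i in ids]
    dim = len(table[0])
    return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]


# Toy embedding table with 3 entries of dimension 2.
table = [[0.0, 0.0], [1.0, 3.0], [3.0, 5.0]]

# A multi-hot feature holding ids 1 and 2 collapses to a single vector.
pooled = mean_combine(table, [1, 2])
```

A single-id (one-hot) feature is the special case where the mean is the looked-up row itself.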
 - 
property item_embedding_table
 - 
table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → torch.nn.modules.module.Module[source]
 - 
classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, embedding_dims: Optional[Dict[str, int]] = None, embedding_dim_default: int = 64, infer_embedding_sizes: bool = False, infer_embedding_sizes_multiplier: float = 2.0, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, combiner: str = 'mean', tags: Optional[Union[merlin_standard_lib.schema.tag.Tag, list, str]] = None, item_id: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, aggregation=None, pre=None, post=None, **kwargs) → Optional[transformers4rec.torch.features.embedding.EmbeddingFeatures][source]
- Instantiates EmbeddingFeatures from a DatasetSchema.
- Parameters
- schema (DatasetSchema) – Dataset schema 
- embedding_dims (Optional[Dict[str, int]], optional) – The dimension of the embedding table for each feature (key), by default None
- embedding_dim_default (int, optional) – Default dimension of the embedding table, used when the feature is not found in embedding_dims, by default 64
- infer_embedding_sizes (bool, optional) – Automatically defines the embedding dimension from the feature cardinality in the schema, by default False
- infer_embedding_sizes_multiplier (float, optional) – Multiplier used by the heuristic that infers the embedding dimension from the feature cardinality. Reasonable values generally range between 2.0 and 10.0, by default 2.0
- embeddings_initializers (Optional[Dict[str, Callable[[Any], None]]], optional) – Dict where keys are feature names and values are callables that initialize the embedding tables, by default None
- combiner (Optional[str], optional) – Feature aggregation option, by default “mean” 
- tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter columns, by default None 
- item_id (Optional[str], optional) – Name of the item id column (feature), by default None 
- automatic_build (bool, optional) – Automatically infers input size from features, by default True 
- max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None
 
- Returns
- Returns the EmbeddingFeatures for the dataset schema
- Return type
- Optional[EmbeddingFeatures] 
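The interplay between `embedding_dims`, `embedding_dim_default`, `infer_embedding_sizes`, and `infer_embedding_sizes_multiplier` can be sketched as follows. Both the precedence (explicit entry, then inference, then default) and the fourth-root heuristic are assumptions made for illustration, and `resolve_embedding_dim` is a hypothetical helper. Consult the library source for the exact behaviour.

```python
import math
from typing import Dict, Optional


def resolve_embedding_dim(
    feature: str,
    cardinality: int,
    embedding_dims: Optional[Dict[str, int]] = None,
    embedding_dim_default: int = 64,
    infer_embedding_sizes: bool = False,
    infer_embedding_sizes_multiplier: float = 2.0,
) -> int:
    """Pick an embedding dimension for one categorical feature (sketch)."""
    # Assumption: an explicit per-feature entry wins.
    if embedding_dims and feature in embedding_dims:
        return embedding_dims[feature]
    # Assumption: the heuristic is ceil(cardinality ** 0.25 * multiplier).
    if infer_embedding_sizes:
        return int(math.ceil(cardinality ** 0.25 * infer_embedding_sizes_multiplier))
    # Otherwise fall back to the default dimension.
    return embedding_dim_default


explicit = resolve_embedding_dim("item_id", 1000, embedding_dims={"item_id": 32})
inferred = resolve_embedding_dim("item_id", 1000, infer_embedding_sizes=True)
fallback = resolve_embedding_dim("item_id", 1000)
```

Under this assumed heuristic a larger multiplier gives wider tables for the same cardinality, which is why the documented range of 2.0 to 10.0 acts as a capacity knob.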
 
 - 
item_ids(inputs) → torch.Tensor[source]
 
- 
class transformers4rec.torch.features.embedding.EmbeddingBagWrapper(num_embeddings: int, embedding_dim: int, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, mode: str = 'mean', sparse: bool = False, _weight: Optional[torch.Tensor] = None, include_last_offset: bool = False, padding_idx: Optional[int] = None, device=None, dtype=None)[source]
- Bases: torch.nn.modules.sparse.EmbeddingBag
weight: torch.Tensor
 
- 
- 
class transformers4rec.torch.features.embedding.SoftEmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], layer_norm: bool = True, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, **kwarg)[source]
- Bases: transformers4rec.torch.features.embedding.EmbeddingFeatures
- Encapsulates continuous features encoded using the soft one-hot encoding embedding technique (SoftEmbedding), from https://arxiv.org/pdf/1708.00065.pdf. In a nutshell, it keeps an embedding table for each continuous feature, and each feature value is represented as a weighted average of that table's embeddings.
- Parameters
- feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features. 
- layer_norm (boolean) – When layer_norm is true, TabularLayerNorm will be used in post. 
- pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward). 
- post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward). 
- aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor. 
 
 - 
classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, soft_embedding_cardinalities: Optional[Dict[str, int]] = None, soft_embedding_cardinality_default: int = 10, soft_embedding_dims: Optional[Dict[str, int]] = None, soft_embedding_dim_default: int = 8, embeddings_initializers: Optional[Dict[str, Callable[[Any], None]]] = None, layer_norm: bool = True, combiner: str = 'mean', tags: Optional[Union[merlin_standard_lib.schema.tag.Tag, list, str]] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, **kwargs) → Optional[transformers4rec.torch.features.embedding.SoftEmbeddingFeatures][source]
- Instantiates SoftEmbeddingFeatures from a DatasetSchema.
- Parameters
- schema (DatasetSchema) – Dataset schema 
- soft_embedding_cardinalities (Optional[Dict[str, int]], optional) – The cardinality of the embedding table for each feature (key), by default None 
- soft_embedding_cardinality_default (Optional[int], optional) – Default cardinality of the embedding table, used when the feature is not found in soft_embedding_cardinalities, by default 10
- soft_embedding_dims (Optional[Dict[str, int]], optional) – The dimension of the embedding table for each feature (key), by default None
- soft_embedding_dim_default (Optional[int], optional) – Default dimension of the embedding table, used when the feature is not found in soft_embedding_dims, by default 8
- embeddings_initializers (Optional[Dict[str, Callable[[Any], None]]], optional) – Dict where keys are feature names and values are callables that initialize the embedding tables, by default None
- combiner (Optional[str], optional) – Feature aggregation option, by default “mean” 
- tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter columns, by default None 
- automatic_build (bool, optional) – Automatically infers input size from features, by default True 
- max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None 
 
- Returns
- Returns a SoftEmbeddingFeatures instance from the dataset schema
- Return type
- Optional[SoftEmbeddingFeatures] 
 
 - 
table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → transformers4rec.torch.features.embedding.SoftEmbedding[source]
 
- 
class transformers4rec.torch.features.embedding.TableConfig(vocabulary_size: int, dim: int, initializer: Optional[Callable[[torch.Tensor], None]] = None, combiner: str = 'mean', name: Optional[str] = None)[source]
- Bases: object
- 
class transformers4rec.torch.features.embedding.FeatureConfig(table: transformers4rec.torch.features.embedding.TableConfig, max_sequence_length: int = 0, name: Optional[str] = None)[source]
- Bases: object
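The relationship between the two configuration classes, including the shared-embedding case mentioned for `feature_config`, can be sketched with stand-in dataclasses. The classes below only mirror the constructor signatures listed above and are not the library's classes. The feature names and sizes are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class TableConfig:
    """Stand-in mirroring the documented TableConfig signature."""
    vocabulary_size: int
    dim: int
    initializer: Optional[Callable] = None
    combiner: str = "mean"
    name: Optional[str] = None


@dataclass(frozen=True)
class FeatureConfig:
    """Stand-in mirroring the documented FeatureConfig signature."""
    table: TableConfig
    max_sequence_length: int = 0
    name: Optional[str] = None


# One table shared by two features, plus a separate table for a third.
item_table = TableConfig(vocabulary_size=1000, dim=64, name="item_id")
feature_config: Dict[str, FeatureConfig] = {
    "item_id": FeatureConfig(item_table),
    "last_clicked_item": FeatureConfig(item_table),
    "category": FeatureConfig(TableConfig(vocabulary_size=50, dim=16, name="category")),
}

# Deduplicating by table name shows that only two tables are needed.
tables = {cfg.table.name: cfg.table for cfg in feature_config.values()}
```

Sharing a table this way means both features index the same vectors, so the two features must draw their ids from the same vocabulary.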
- 
class transformers4rec.torch.features.embedding.SoftEmbedding(num_embeddings, embeddings_dim, emb_initializer=None)[source]
- Bases: torch.nn.modules.module.Module
- Soft one-hot encoding embedding technique, from https://arxiv.org/pdf/1708.00065.pdf. In a nutshell, it represents a continuous feature as a weighted average of embeddings.
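The "weighted average of embeddings" idea can be sketched in plain Python. The sketch assumes the weights come from a softmax over a linear projection of the scalar value, which is one common reading of the technique. `soft_embedding` and its parameter names are hypothetical and do not reproduce the library's module.

```python
import math
from typing import List


def soft_embedding(
    value: float,
    projection_weight: List[float],
    projection_bias: List[float],
    table: List[List[float]],
) -> List[float]:
    """Embed one scalar as a softmax-weighted average of the table rows."""
    # Project the scalar to one logit per embedding row.
    logits = [w * value + b for w, b in zip(projection_weight, projection_bias)]
    # Numerically stable softmax turns the logits into weights that sum to 1.
    peak = max(logits)
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted average of the embedding rows.
    dim = len(table[0])
    return [sum(wt * row[d] for wt, row in zip(weights, table)) for d in range(dim)]


table = [[0.0, 2.0], [2.0, 4.0]]

# With a zero projection every row gets the same weight, so the result
# is the plain mean of the rows.
uniform = soft_embedding(3.7, [0.0, 0.0], [0.0, 0.0], table)

# A strongly positive weight on the second logit pulls the result
# towards the second row for positive inputs.
skewed = soft_embedding(5.0, [0.0, 4.0], [0.0, 0.0], table)
```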
transformers4rec.torch.features.sequence module
- 
class transformers4rec.torch.features.sequence.SequenceEmbeddingFeatures(feature_config: Dict[str, transformers4rec.torch.features.embedding.FeatureConfig], item_id: Optional[str] = None, padding_idx: int = 0, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None)[source]
- Bases: transformers4rec.torch.features.embedding.EmbeddingFeatures
- Input block for embedding lookups for categorical features. This module produces 3-D tensors, which is useful for sequential models such as transformers.
- Parameters
- feature_config (Dict[str, FeatureConfig]) – This specifies what TableConfig to use for each feature. For shared embeddings, the same TableConfig can be used for multiple features. 
- item_id (str, optional) – The name of the feature that’s used for the item_id. 
- padding_idx (int) – The symbol to use for padding. 
- pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward). 
- post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward). 
- aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor. 
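The 3-D output and the role of `padding_idx` can be illustrated with nested lists standing in for a tensor of shape (batch, sequence, embedding dim). `embed_sequences` is a hypothetical helper. Mapping the padding id to a zero vector follows the default behaviour of `torch.nn.Embedding` with `padding_idx` set, and is assumed here to carry over to this module.

```python
from typing import List


def embed_sequences(
    batch: List[List[int]],
    table: List[List[float]],
    padding_idx: int = 0,
) -> List[List[List[float]]]:
    """Look up each id in each sequence; padded positions become zero vectors."""
    dim = len(table[0])
    return [
        [[0.0] * dim if item == padding_idx else list(table[item]) for item in sequence]
        for sequence in batch
    ]


table = [[9.0, 9.0], [1.0, 2.0], [3.0, 4.0]]  # row 0 is reserved for padding
batch = [[1, 2, 0], [2, 0, 0]]                # two sessions padded to length 3

embedded = embed_sequences(batch, table)
shape = (len(embedded), len(embedded[0]), len(embedded[0][0]))
```

Keeping the sequence axis, instead of pooling over it as the non-sequential block does for multi-hot features, is what lets a downstream transformer attend over individual positions.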
 
 - 
table_to_embedding_module(table: transformers4rec.torch.features.embedding.TableConfig) → torch.nn.modules.sparse.Embedding[source]
 
- 
class transformers4rec.torch.features.sequence.TabularSequenceFeatures(continuous_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, categorical_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, text_embedding_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, projection_module: Optional[Union[transformers4rec.torch.block.base.BlockBase, transformers4rec.torch.block.base.BuildableBlock, torch.nn.modules.module.Module]] = None, masking: Optional[transformers4rec.torch.masking.MaskSequence] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]
- Bases: transformers4rec.torch.features.tabular.TabularFeatures
- Input module that combines different types of features into a sequence: continuous, categorical & text.
- Parameters
- continuous_module (TabularModule, optional) – Module used to process continuous features. 
- categorical_module (TabularModule, optional) – Module used to process categorical features. 
- text_embedding_module (TabularModule, optional) – Module used to process text features. 
- projection_module (BlockOrModule, optional) – Module that’s used to project the output of this module, typically done by an MLPBlock. 
- masking (MaskSequence, optional) – Masking to apply to the inputs. 
- pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward). 
- post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward). 
- aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor. 
 
 - 
EMBEDDING_MODULE_CLASS
- alias of transformers4rec.torch.features.sequence.SequenceEmbeddingFeatures
 - 
classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CATEGORICAL: 'categorical'>,), aggregation: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, continuous_projection: Optional[Union[List[int], int]] = None, continuous_soft_embeddings: bool = False, projection: Optional[Union[torch.nn.modules.module.Module, transformers4rec.torch.block.base.BuildableBlock]] = None, d_output: Optional[int] = None, masking: Optional[Union[str, transformers4rec.torch.masking.MaskSequence]] = None, **kwargs) → transformers4rec.torch.features.sequence.TabularSequenceFeatures[source]
- Instantiates TabularSequenceFeatures from a DatasetSchema.
- Parameters
- schema (DatasetSchema) – Dataset schema 
- continuous_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the continuous features, by default Tag.CONTINUOUS 
- categorical_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the categorical features, by default Tag.CATEGORICAL 
- aggregation (Optional[str], optional) – Feature aggregation option, by default None 
- automatic_build (bool, optional) – Automatically infers input size from features, by default True 
- max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None
- continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenates all numerical features and projects them through a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None
- continuous_soft_embeddings (bool) – Indicates whether the soft one-hot encoding technique is used to represent continuous features, by default False
- projection (Optional[Union[torch.nn.Module, BuildableBlock]], optional) – If set, projects the aggregated embedding vectors into the hidden-dimension vector space, by default None
- d_output (Optional[int], optional) – If set, initializes an MLPBlock as the projection module to project the embedding vectors, by default None
- masking (Optional[Union[str, MaskSequence]], optional) – If set, applies masking to the input embeddings and computes masked labels. It requires a categorical_module that includes an item_id column, by default None
 
- Returns
- Returns TabularSequenceFeatures from a dataset schema
- Return type
- TabularSequenceFeatures
 
 - 
property masking
 - 
property item_id
 - 
property item_embedding_table
 
transformers4rec.torch.features.tabular module
- 
class transformers4rec.torch.features.tabular.TabularFeatures(continuous_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, categorical_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, text_embedding_module: Optional[transformers4rec.torch.tabular.base.TabularModule] = None, pre: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, post: Optional[Union[str, transformers4rec.torch.tabular.base.TabularTransformation]] = None, aggregation: Optional[Union[str, transformers4rec.torch.tabular.base.TabularAggregation]] = None, schema: Optional[merlin_standard_lib.schema.schema.Schema] = None, **kwargs)[source]
- Bases: transformers4rec.torch.tabular.base.MergeTabular
- Input module that combines different types of features: continuous, categorical & text.
- Parameters
- continuous_module (TabularModule, optional) – Module used to process continuous features. 
- categorical_module (TabularModule, optional) – Module used to process categorical features. 
- text_embedding_module (TabularModule, optional) – Module used to process text features. 
 
- pre (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs when the module is called (so before forward).
- post (Union[str, TabularTransformation, List[str], List[TabularTransformation]], optional) – Transformations to apply on the inputs after the module is called (so after forward).
- aggregation (Union[str, TabularAggregation], optional) – Aggregation to apply after processing the forward-method to output a single Tensor.
 - 
CONTINUOUS_MODULE_CLASS
- alias of transformers4rec.torch.features.continuous.ContinuousFeatures
 - 
EMBEDDING_MODULE_CLASS
- alias of transformers4rec.torch.features.embedding.EmbeddingFeatures
 - 
SOFT_EMBEDDING_MODULE_CLASS
- alias of transformers4rec.torch.features.embedding.SoftEmbeddingFeatures
 - 
project_continuous_features(mlp_layers_dims: Union[List[int], int]) → transformers4rec.torch.features.tabular.TabularFeatures[source]
- Combine all concatenated continuous features with stacked MLP layers 
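The shape effect of projecting concatenated continuous features through stacked layers can be sketched as below. Everything here is a hypothetical illustration: the helpers are not library functions, the constant layer weights exist only to keep the output deterministic, and treating a single int as a one-layer list is an assumption about how the `Union[List[int], int]` argument is interpreted.

```python
from typing import Dict, List, Union


def as_layer_dims(mlp_layers_dims: Union[List[int], int]) -> List[int]:
    """Assumption: a single int is treated as a one-layer list."""
    if isinstance(mlp_layers_dims, int):
        return [mlp_layers_dims]
    return list(mlp_layers_dims)


def dense_relu(row: List[float], out_dim: int, weight: float = 0.1) -> List[float]:
    """Toy dense layer with ReLU; every weight is the same constant."""
    activation = max(0.0, sum(row) * weight)
    return [activation] * out_dim


def project_continuous(
    features: Dict[str, List[float]],
    mlp_layers_dims: Union[List[int], int],
) -> List[List[float]]:
    """Concatenate the continuous features, then apply the stacked layers."""
    names = sorted(features)
    n_rows = len(features[names[0]])
    rows = [[features[name][i] for name in names] for i in range(n_rows)]
    for out_dim in as_layer_dims(mlp_layers_dims):
        rows = [dense_relu(row, out_dim) for row in rows]
    return rows


continuous = {"price": [1.0, 2.0], "age": [3.0, 4.0]}
projected = project_continuous(continuous, [8, 4])  # two layers: 8 units, then 4
single = project_continuous(continuous, 16)         # one layer of 16 units
```

The last entry of the dimension list determines the width of the projected continuous representation, independent of how many continuous features were concatenated.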
 - 
classmethod from_schema(schema: merlin_standard_lib.schema.schema.Schema, continuous_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CONTINUOUS: 'continuous'>,), categorical_tags: Optional[Union[List[str], List[merlin_standard_lib.schema.tag.Tag], List[Union[merlin_standard_lib.schema.tag.Tag, str]], Tuple[merlin_standard_lib.schema.tag.Tag]]] = (<Tag.CATEGORICAL: 'categorical'>,), aggregation: Optional[str] = None, automatic_build: bool = True, max_sequence_length: Optional[int] = None, continuous_projection: Optional[Union[List[int], int]] = None, continuous_soft_embeddings: bool = False, **kwargs) → transformers4rec.torch.features.tabular.TabularFeatures[source]
- Instantiates TabularFeatures from a DatasetSchema.
- Parameters
- schema (DatasetSchema) – Dataset schema 
- continuous_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the continuous features, by default Tag.CONTINUOUS 
- categorical_tags (Optional[Union[DefaultTags, list, str]], optional) – Tags to filter the categorical features, by default Tag.CATEGORICAL 
- aggregation (Optional[str], optional) – Feature aggregation option, by default None 
- automatic_build (bool, optional) – Automatically infers input size from features, by default True 
- max_sequence_length (Optional[int], optional) – Maximum sequence length for list features, by default None
- continuous_projection (Optional[Union[List[int], int]], optional) – If set, concatenate all numerical features and project them by a number of MLP layers. The argument accepts a list with the dimensions of the MLP layers, by default None 
- continuous_soft_embeddings (bool) – Indicates if the soft one-hot encoding technique must be used to represent continuous features, by default False 
 
- Returns
- Returns TabularFeatures from a dataset schema
- Return type
- TabularFeatures
 
 - 
property continuous_module
 - 
property categorical_module